General Notes drawn up for the Guidance of Authors on the


Preparation of Manuscripts for Biometrika 





Papers should be headed by an informative title, by the name or names of the author(s), 
and where appropriate by the name and address of the laboratory or institution where 
the work was performed. Manuscripts should normally be typed in double spacing 
with wide margins and on one side of the paper only; they should be on sheets of 
uniform size, numbered consecutively. In addition to the top copy it is editorially 
convenient, though not essential, that a second copy of the paper be submitted to the 
journal. It is helpful if a shortened version of the title, not exceeding 65 letters and 
spaces, or about eight words in length, could be supplied for a page-head. 


Special care should be taken with mathematics, and any complicated mathematical 
expressions should be inserted very clearly in the manuscript. Where algebraic expres- 
sions are written in, particular attention should be given to distinguishing between 
capitals and lower-case letters, e.g. S, s, etc. Where Greek letters or unusual symbols 
are used it may help the printer if they are distinguished by writing or underlining in 
coloured ink, and a key provided for the compositor. Alternatively each unusual letter 
may be indicated by name in the margin at the first mention, e.g. ‘μ = Greek mu’,
‘R_F = italic capital R and sub-script italic capital F’. Letters in script should also be
indicated in the margin. Equation numbers should be placed after, not before, the 
equation. 


Tables, except very small ones, should be on separate sheets of paper, since they are 
not set up at the same time as the body of the text; the position at which they are to be 
inserted should be indicated on the manuscript. They should have headings or footnotes 
which make their general meaning comprehensible to the reader without reference to 
the text. 


Illustrations should be kept to a minimum. The author’s name and the number of 
the figure should be written on the back of each, with legends on a separate sheet. 
Diagrams that are not drawn by professional draughtsmen generally have to be re-drawn 
and re-lettered before blocks are made, and for this reason it is advisable to draw 
illustrations carefully in pencil from which the printer will make good ink drawings. It is 
most important that no mistake should be made in any part of the author’s original 
drawing. 


The Harvard system is used for references in this journal, references being collected 
at the end of the paper in alphabetical order of author. Each reference must give name 
followed by initials of author, the year of publication in brackets, title of paper, journal 
title abbreviated in accordance with the World List of Scientific Periodicals, volume 
number in arabic numerals with wavy line to indicate black type and the number of the 
first and last page in arabic numerals. 


Papers should be submitted in their final form so that proof correction is confined 
to a minimum. 























BIOMETRIKA PUBLICATIONS 


Issued by the Cambridge University Press, Bentley House, London, N.W. 1 
and obtainable from any bookseller 


Tables of the Incomplete B-Function EDITED BY KARL PEARSON
59 pages of Introduction and 494 pages of Tables Price: 55s. net 


Tables of the Incomplete Γ-Function EDITED BY KARL PEARSON
31 pages of Introduction and 164 pages of Tables Price: 42s. net 


Tables of the Complete and Incomplete Elliptic Integrals 
(from LEGENDRE’S Traité des Fonctions Elliptiques. With autographed portrait of LEGENDRE) 
39 pages of Introduction by KARL PEARSON and 94 pages of Tables Price: 12s. 6d. net 


Tables of the Ordinates and Probability Integral of the Distribution of 
the Correlation Coefficient in Small Samples By F. N. DAVID 
38 pages of Introduction, 55 pages of Tables, 10 Diagrams and 4 Charts Price: 17s. 6d. net 


Biometrika Tables for Statisticians, Vol. I 
EDITED By E. S. PEARSON and H. O. HARTLEY for the Biometrika Trust 
102 pages of Introduction and 136 pages of Tables Price: 25s. net 


The Life, Letters and Labours of Francis Galton, Vols. I, II, IIIA & IIIB
By KARL PEARSON, F.R.S. Price: £5. 5s. net


Karl Pearson: An Appreciation of Some Aspects of his Life and Work
By E. S. PEARSON Price: 15s. net


A Bibliography of the Statistical and Other Writings of Karl Pearson 
COMPILED BY G. M. MORANT, with the assistance of B. L. WELCH 


Price: 6s. net 


“Student’s” Collected Papers EDITED BY E. S. PEARSON and
JOHN WISHART with a FOREWORD BY LAUNCE McMULLEN Price: 21s. net


Karl Pearson’s Early Statistical Papers 


Reprinted by photo-lithography for the Biometrika Trust, with the permission of the original publishers. 
The Volume contains eleven papers, including the more important of the memoirs entitled “‘ Mathematical 
Contributions to the Theory of Evolution”’, first published in the Philosophical Transactions of the Royal 
Society. The original paper deriving the χ²-distribution, published in 1900 in the Philosophical Magazine, is
also included. Price: 25s. net 























PUBLICATIONS OF THE DEPARTMENT OF 


STATISTICS, UNIVERSITY COLLEGE, LONDON 


Issued by the Cambridge University Press, Bentley House, London, N.W. 1 
and obtainable from any bookseller 


TRACTS FOR COMPUTERS 


. Tables of the Digamma and Trigamma Functions. By ELEANOR PAIRMAN, M.A. 


Tables for summing S = [series not legible in source], where the p’s and q’s are numerical
factors. Price 5s. net.


. Table of Coefficients of Everett’s Central-Difference Interpolation Formula. By A. J. 
THOMPSON, PH.D. Second edition. Price 7s. 6d. net. 


. Table of the Logarithms of the Complete Γ-Function (to ten decimal places) for Argument
2 to 1200 beyond Legendre’s Range (Argument 1 to 2). By EGON S. PEARSON, D.Sc.
Price 5s. net. 


. Log Γ(x) from x = 1 to 50·9 by intervals of 0·01. By JOHN BROWNLEE, M.D., D.Sc.
Price 5s. net. 


. On Quadrature and Cubature or on Methods of Determining Approximately Single and 
Double Integrals. By J. O. IRWIN, D.Sc. Price 7s. 6d. net.


. Tables of the Probable Error of the Coefficient of Correlation. By KARL HOLZINGER, PH.D. 
Price 5s. net. 


. Bibliotheca Tabularum Mathematicarum, being a Descriptive Catalogue of Mathematical 
Tables. Part I. A, Logarithms of Numbers. By JAMES HENDERSON, PH.D. Price 9s. net.


. Random Sampling Numbers. By L. H. C. Tippett, M.Sc., with a Foreword by KARL 
PEARSON. Price 5s. net. 


. Tables of tan⁻¹x and log (1 + x²). To assist in the calculation of the ordinates of a Pearson
Type IV curve. By L. J. COMRIE, PH.D. Price 5s. net.


. Random Sampling Numbers (2nd Series). By M. G. KENDALL and B. BABINGTON SMITH. 
Price 5s. net. 


. Random Normal Deviates. By HERMAN WOLD. Price 5s. net. 


. Correlated Random Normal Deviates. By E. C. FIELLER, T. LEWIS and E. S. PEARSON.
Price 10s. 6d. net.
Nos. II, III, IV, VI and VII are out of print





LOGARITHMETICA BRITANNICA


A standard Table of Logarithms to Twenty Decimal Places. By A. J. THOMPSON, Ph.D.
(commenced in 1922 to commemorate the tercentenary of the publication of HENRY BRIGGS’S
Arithmetica Logarithmica).
The nine separate sections of this Table have now been issued, and the complete work
consisting of the logarithms of numbers 10,000-100,000, together with Dr Thompson’s
General Introduction (98 pp.) is available in two bound volumes.


Price £8. 8s. 0d.























NEW STATISTICAL TABLES: SEPARATES RE-ISSUED 
FROM BIOMETRIKA 


To be obtained from 
BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1 


I. From Biometrika, Vols. 22, 27 and 28
Tests of Normality. By E. S. PEARSON and R. C. GEARY Price 2s. 6d., post free 


II. From Biometrika, Vol. 32, pp. 168-181 and 188-189
(1) Table of percentage points of the incomplete beta-function
(2) Table of percentage points of the χ² distribution
Stitched together with introductory matter. Price 2s. 6d., post free 


III. From Biometrika, Vol. 32, pp. 300-310
(1) Table of the probability integral of the range in samples from a normal population 
(2) Table of the percentage points of the range 
(3) Table of the percentage points of the t-distribution 
Stitched together with introductory matter. Price 2s. 6d., post free 


IV. From Biometrika, Vol. 33, pp. 73-88 
Table of percentage points of the inverted beta (F) distribution 
With introductory matter. Price 2s. 6d., post free 


V. From Biometrika, Vol. 33, pp. 252-265 


(1) Table of the probability integral of the mean deviation in samples from a normal 
population 


(2) Table of the percentage points of the mean deviation 
Stitched together with introductory matter. Price 2s. 6d., post free 


VI. From Biometrika, Vol. 33, pp. 296-304
Table for testing the homogeneity of a set of estimated variances 
With introductory matter. Price 2s., post free 


VII. From Biometrika, Vol. 35, pp. 145-156
Table of significance levels for the Fisher-Yates test of significance in 2×2 contingency
tables. By D. J. FINNEY With introductory matter. Price 2s. 6d., post free


VIII. From Biometrika, Vol. 35, pp. 191-201


Table for the calculation of working probits and weights in probit analysis. By D. J. FINNEY 
and W. L. STEVENS With introductory matter. Price 2s. 6d., post free 


IX. From Biometrika, Vol. 36, pp. 267-289 
Tables of autoregressive series. By M.G. KENDALL With introductory matter. Price 2s. 6d., post free 


X, XIV, XVIII, and XX. From Biometrika, Part 1 from Vol. 36, pp. 431-449, Parts 2 and 3 from 
Vol. 38, pp. 435-462, Part 4 from Vol. 40, pp. 427-446, and Part 5 from Vol. 42, pp. 223-242 
Tables of symmetric functions. By F. N. DAVID and M. G. KENDALL 


With introductory matter. Price 14s. 6d., post free 
(Part 1, 2s. 6d.; Parts 2 and 3, 4s.; Part 4, 4s.; Part 5, 4s.) 


XI. From Biometrika, Vol. 39, p. 190 and Vol. 43, pp. 449-451 
Tables of percentage points of the extreme “Studentized” deviate from the sample mean. 
By K. R. NAIR and H. A. DAVID With introductory matter. Price 1s., post free 
NEW STATISTICAL TABLES: continued


XII. From Biometrika, Vol. 37, pp. 168-172 and pp. 313-325 
(1) Table of the probability integral of the t-distribution 
(2) Table of the χ² integral, and of the cumulative Poisson distribution. By H. O. HARTLEY
and E. S. PEARSON Stitched together with introductory matter. Price 5s., post free 


XIII. From Biometrika, Vol. 38, pp. 112-130
Charts of the power function for analysis of variance tests, derived from the non-central 
F-distribution. By E. S. PEARSON and H. O. HARTLEY 
With introductory matter. Price 2s. 6d., post free 
XV. From Biometrika, Vol. 38, pp. 423-426 
A chart for the incomplete beta-function and the cumulative binomial distribution. By H. O. 
HARTLEY and E. R. FITCH With introductory matter and ruler scale. Price 2s. 6d., post free


XVI. From Biometrika, Vol. 40, pp. 70-73 
Tables of the angular transformation. By W. L. STEVENS 
With introductory matter. Price 1s., post free 
XVII. From Biometrika, Vol. 40, pp. 74-86 


Tests of significance in a 2×2 contingency table: extension of Finney’s table (No. VII).
Computed by R. LATSCHA With introductory matter. Price 2s. 6d., post free 


XIX. From Biometrika, Vol. 41, pp. 253-260 
Tables of generalized k-statistics. By S. H. ABDEL-ATY With introductory matter. Price 2s., post free 


XXI. From Biometrika, Vol. 42, pp. 494-511 


A new form of table for significance tests in a 2x2 contingency table. By P. ARMSEN 
With introductory matter. Price 2s. 6d., post free 


XXII. From Biometrika, Vol. 43, pp. 388-403 
Tables for certain applications of sequential methods in the analysis of variance. By W. D. RAY 
With introductory matter. Price 2s. 6d., post free 

XXIII. From Biometrika, Vol. 43, pp. 423-435


Table for determining confidence limits for a proportion in binomial sampling. By EDWIN L. 
CROW With introductory matter. Price 2s. 6d., post free 


XXIV. From Biometrika, Vol. 44, pp. 411-419 
Tables for estimating the normal distribution function of normit analysis. Part I. Tables and 
description of their use. By JOSEPH BERKSON With introductory matter. Price 2s. 6d., post free 


XXV. From Biometrika, Vol. 44, pp. 482-489 
Table of significance points for a two-sample t-test based on range. By P. G. MOORE 
With introductory matter. Price 2s. 6d., post free 
XXVI. From Biometrika, Vols. 44 & 45 


Tables of the upper percentage points of the generalized beta distribution. By F. G. FOSTER 
and D. H. REES With introductory matter. Price 5s., post free 


XXVII. From Biometrika, Vol. 46, pp. 178-204 


Tables of 1000 standardized random deviates from certain non-normal distributions. By M. H.
QUENOUILLE. With a note by E. S. Pearson With introductory matter. Price 2s. 6d., post free 





SEPARATES RE-ISSUED FROM BIOMETRIKA 


To be obtained from 
BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1 


The Biometrika Office has available for sale copies of many of the papers which have been published in Biometrika (right back to Volume 1). Readers 
requiring offprints of any papers are invited to enquire from the above address whether they are still available. The following are a few examples 
of fairly recent separates of which copies may be obtained: 


From Biometrika, Vol. 39, pp. 324-345 
Rank Analysis of Incomplete Block Designs I. The Method of Paired Comparisons. By R. A.
BRADLEY and M. E. TERRY Price 4s., post free 


From Biometrika, Vol. 41, pp. 502-537 
Rank Analysis of Incomplete Block Designs II. Additional Tables of Paired Comparisons. By 
R. A. BRADLEY Price 5s., post free 


From Biometrika, Vol. 43, pp. 203-205 
Further critical values for the two-means problem. By W. H. TRICKETT, B. L. WELCH and G. S.
JAMES Price 1s., post free 
From Biometrika, Vol. 44, pp. 1-8 
John Wishart, 1898-1956. Obituary Notice and Bibliography. By E. S. PEARSON 
From Biometrika, Vol. 44, pp. 490-514 Price 2: pa 
A bibliography on the theory of queues. By ALISON DOIG Price 5s., post free 
From Biometrika, Vol. 45, pp. 293-315 


THOMAS BAYES’S Essay towards solving a problem in the doctrine of chances. [Reproduced 
from Phil. Trans. Roy. Soc. 1763, 53, 370-418.] With a biographical note by G. A. BARNARD. 


From Biometrika, Vol. 45, pp. 521-543 Pvies, 5... gem 
A bibliography on life testing and related topics. By WILLIAM MENDENHALL Price 5s., post free 


BIOMETRIKA INDEX. Comprising Subject Index for Vols. 1-37 and Author Index for Vols. 
1-40, with Author Index Supplement covering Vols. 41-43. Price 6s. or $1.00, post free 














STATISTICAL EXERCISES 


Issued by the DEPARTMENT OF STATISTICS 
UNIVERSITY COLLEGE, LONDON, W.C.1 


Part I. Elementary Statistical Exercises. Compiled by F. N. DAVID (91 pp.) 


These exercises, published in 1953, were collected in connection with the numerical classwork 
undertaken by students during the first year of the B.Sc. Special Degree course in Statistics at 
University College. They deal with applications of the more elementary univariate and bivariate 
theory. Price: 6s. 6d. 


Part II. Statistical Exercises. Analysis of variance and associated techniques. (107 pp.) 
Compiled by N. L. JOHNSON 


This volume includes over 100 exercises on the following topics: 

Analysis of variance techniques for the following experimental designs : randomized block, Latin 
square, confounding, split plots, fractional replication. Linear and curvilinear regression. Dosage- 
mortality techniques. Discriminant functions. Time series. Curve fitting. Price 12s. 


[A supplementary volume containing suggested solutions to a substantial proportion of the exercises
in Part II is in preparation.] 


Both Parts are reproduced by offset-litho processes from typescript and bound in stiff paper 
covers. Copies may be obtained from a bookseller or direct from the Department of Statistics, 
University College, Gower Street, London, W.C. 1. If ordered direct, payment must be made in 
advance and cheques made payable to University College London. 
APPLIED STATISTICS


A JOURNAL OF THE ROYAL STATISTICAL SOCIETY


VOLUME VIII, No. 2 CONTENTS JUNE 1959
Matching and Prediction on the Principle of Biological Classification. WILLIAM A. BELSON.

Deferred Sentencing Schemes. I. D. HILL, GARETH HORSNELL and BERNARD T. WARNER.

The Length of Cigarette Stubs. PERCY G. GRAY and ELIZABETH A. PARR.

The Use of Edge-Punched Cards in Statistical Computation. DENIS H. WARD.

Designing a Budget Survey. W. F. F. KEMSLEY.

Pseudo-Random Elements for Computers. E. S. PAGE.

LETTER TO THE EDITOR: Acceptance Sampling.

MEETINGS OF SECTIONS OF THE ROYAL STATISTICAL SOCIETY.

BOOK REVIEWS AND PUBLICATIONS RECEIVED.


VOLUME VIII, No. 3 NOVEMBER 1959
Adjusting Single Sampling Plans for Finite Lot Size. HUGO C. HAMAKER.

A Reliability Study of Sensory Assessments. A. S. C. EHRENBERG.

A Statistical Approach to Stores Auditing. J. S. JAMES.

Weekly, Monthly, and Quarterly Tolerances for Coke Quality. DENIS H. WARD.

A Problem of Subjective Classification in Industrial Medicine. J. R. ASHFORD.

Shipping Costs and the Terms of Trade: Australia and New Zealand. MICHAEL CHISHOLM.

NOTES AND COMMENTS.

MEETINGS OF SECTIONS OF THE ROYAL STATISTICAL SOCIETY.

BOOK REVIEWS AND PUBLICATIONS RECEIVED.


APPLIED STATISTICS is published three times per year. The annual subscription is £1. 10s. (U.S.A. and 
Canada $5.00). Single copies 13s. post free. Orders should be sent to 


Oliver and Boyd Ltd., Tweeddale Court, 14 High Street, Edinburgh, 1 
TECHNOMETRICS


A Journal of Statistics for the Physical,
Chemical and Engineering Sciences


VOL. 1, No. 2 CONTENTS MAY 1959
Measurements Made by Matching with Known Standards. W. J. YOUDEN, W. S. CONNOR and N. C. SEVERO.

Random Balance Experimentation. F. E. SATTERTHWAITE.

The Application of Random Balance Designs. THOMAS A. BUDNE.

Discussion of the Papers of Messrs Satterthwaite and Budne. W. J. YOUDEN, O. KEMPTHORNE, J. W. TUKEY,
G. E. P. BOX and J. S. HUNTER.

Quick Analysis Methods for Random Balance Screening Experiments. F. J. ANSCOMBE.


VOL. 1, No. 3 AUGUST 1959
Simplified Estimators for the Normal Distribution when Samples are Singly Censored or Truncated. A. CLIFFORD
COHEN, Jr.


Control Chart Tests Based on Geometric Moving Averages. S. W. ROBERTS. 

The Measuring Process. JOHN MANDEL. 

Factorial Experiments in Life Testing. MARVIN ZELEN. 

The Use of LaGrange Multipliers with Response Surfaces. A. W. UMLAND and W. N. SMITH. 


A Statistical Model for Evaluating the Reliability of Safety Systems for Plant Manufacturing Hazardous Products. 
Louis B. KAHN. 


Members of the American Statistical Association and the American Society for Quality Control may subscribe at a 
Special rate of $6.00 per year. Non-member subscriptions are $8.00 per year. Remittances, made payable to 
Technometrics, may be sent to the office of either society, as follows: 


AMERICAN STATISTICAL ASSOCIATION,
ROOM 404, BEACON BUILDINGS,
1757 K STREET, NW,
WASHINGTON 6, DC, U.S.A.

AMERICAN SOCIETY FOR QUALITY CONTROL,
6197 PLANKINTON BUILDING,
161 W. WISCONSIN AVENUE,
MILWAUKEE 3, WISCONSIN, U.S.A.









































The Annals of Mathematical Statistics 


The Official Journal of the Institute of Mathematical Statistics 


VOL. 30, No. 4 CONTENTS DECEMBER, 1959
Some Validity Criteria for Statistical Inferences. ROBERT J. BUEHLER.

Conditional Confidence Level Properties. DAVID L. WALLACE.

An Example of Wide Discrepancy Between Fiducial and Confidence Intervals. CHARLES STEIN.

Optimum Invariant Tests. E. L. LEHMANN.

The Weighted Compounding of Two Independent Significance Tests. M. ZELEN and L. S. JOEL.

Bayes Acceptance Sampling Procedures for Large Lots. D. GUTHRIE, Jr. and M. V. JOHNS, Jr.

Optimum Tolerance Regions and Power When Sampling from Some Non-Normal Universes. IRWIN GUTTMAN.

Properties of Model II—Type Analysis of Variance Tests, A: Optimum Nature of the F-Test for Model II in the
Balanced Case. LEON H. HERBACH.

Some Remarks on Herbach’s Paper, ‘Optimum Nature of the F-Test for Model II in the Balanced Case’. WERNER
GAUTSCHI.

The Most-Economical Character of Some Bechhofer and Sobel Decision Rules. WM. JACKSON HALL.

The Admissibility of Pitman’s Estimator of a Single Location Parameter. CHARLES STEIN.

The Use of Sample Quasi-Ranges in Estimating Population Standard Deviation. H. LEON HARTER.

The Joint Cumulants of True Values and Errors of Measurement. FREDERIC M. LORD.

Some Tests of Permutation Symmetry. R. WORMLEIGHTON.

Contributions to the Theory of Rank Order Statistics—The One-Sample Case. I. RICHARD SAVAGE.

The Distribution of a Generalized D_n^+ Statistic. MEYER DWASS.

Null Distribution of the Hodges Bivariate Sign Test. JEROME KLOTZ.

Exact Nonparametric Tests for Randomized Blocks. JOHN E. WALSH.

A Generalization of Partially Balanced Incomplete Block Designs. B. V. SHAH.

The Non-Existence of Certain PBIB Designs. MANOHAR NARHAR VARTAK.

A Necessary Condition for Existence of Regular and Symmetrical Experimental Designs of Triangular Type, with
Partially Balanced Incomplete Blocks. JUNJIRO OGAWA.

Optimal Spacing in Regression Analysis. H. A. DAVID and BEVERLY E. ARENS.

Third Order Rotatable Designs for Exploring Response Surfaces. D. A. GARDINER, A. H. E. GRANDAGE and
R. J. HADER.

Second Order Rotatable Designs in Three Dimensions. R. C. BOSE and NORMAN R. DRAPER.

The Probability in the Extreme Tail of a Convolution. DAVID BLACKWELL and J. L. HODGES, Jr.

Bounds on Normal Approximations to Student’s and the Chi-Square Distributions. DAVID L. WALLACE.

Approximate Expressions for the Conditional Mean and Variance over Small Intervals of a Continuous Distribution.
GUNNAR EKMAN.

On the Moments of the Trace of a Matrix and Approximations to its Distribution. K. C. S. PILLAI and TITO A.
MIJARES.

Random Graphs. E. N. GILBERT.

Scale Mixing of Symmetric Distributions with Zero Means. E. M. L. BEALE and C. L. MALLOWS.

Measurability of Extensions of Continuous Random Transforms. OTTO HANŠ.

A Convolutive Class of Monotone Likelihood Ratio Families. S. G. GHURYE and DAVID L. WALLACE.

On the Laws of Cauchy and Gauss. R. G. LAHA.

Continuous Sampling Procedures without Control. C. DERMAN, M. V. JOHNS, Jr. and G. J. LIEBERMAN.

Some Convergence Theorems for Stationary Stochastic Processes. T. KAWATA.

Large Excursions of Gaussian Processes. MARK KAC and DAVID SLEPIAN.

The Capacity of a Class of Channels. DAVID BLACKWELL, LEO BREIMAN and A. J. THOMASIAN.

Infinite Codes for Memoryless Channels. DAVID BLACKWELL.

NOTES.

CORRECTION NOTES.

ABSTRACTS OF PAPERS.

PUBLICATIONS RECEIVED.


Subscription rate $12.00 per year in the United States and Canada and $10.00 per year elsewhere


ADDRESS ORDERS FOR SUBSCRIPTIONS AND BACK NUMBERS TO

PROFESSOR A. H. BOWKER, TREASURER, INSTITUTE OF MATHEMATICAL
STATISTICS, DEPARTMENT OF STATISTICS, STANFORD UNIVERSITY,
STANFORD, CALIFORNIA, U.S.A.
JOURNAL OF THE
AMERICAN STATISTICAL ASSOCIATION


VOL. 54, No. 287 CONTENTS SEPTEMBER 1959
NORMAN M. KAPLAN—Some Methodological Notes on the Deflation of Construction.

WILLIAM A. CROMARTY—An Econometric Model for United States Agriculture.

PETER E. DE JANOSI—A Note on the Relationship between Earning Expectations and New Car Purchases.

CHARLES WINDLE—The Accuracy of Census Literacy Statistics in Iran.

RONALD H. BEATTIE—Sources of Statistics on Crime and Correction.

GORDON TULLOCK—Publication Decisions and Tests of Significance—A Comment.
AN INVESTIGATION OF HARTLEY’S METHOD FOR FITTING
AN EXPONENTIAL CURVE


By H. D. PATTERSON AND S. LIPTON
Rothamsted Experimental Station 


INTRODUCTION 


Hartley (1948) showed that the parameters of an exponential regression curve

$$E(y) = \alpha - \beta\rho^{x} \qquad (0 < \rho < 1), \tag{1}$$

where $x = 0, 1, 2, \ldots, n-1$, can be estimated by considering the regression of $y_x$ on $x$ and on
certain partial sums of the $y_x$. Since then, Stevens (1951) has described an iterative method
for obtaining the least-squares estimates of $\alpha$, $\beta$ and $\rho$ which is very suitable for routine
applications when a high-speed computer is available and which, moreover, can be used
when the $x$’s are not equally spaced. When computing facilities are limited, however, Stevens’s
method is somewhat laborious. For this reason other methods, including Hartley’s, are
still in common use.

Hartley (1948) showed that his estimates of $\rho$ are of high efficiency for very large $n$ in
the special case $\alpha = 0$. It might reasonably be supposed that the estimates are also highly
efficient for finite $n$ and for $\alpha$ unknown, although this does not necessarily follow. The purpose
of the present paper is to discuss the efficiency and bias of these estimates and of other
estimates of similar type using general formulae derived by Patterson (1958).

We will not be concerned with estimates of $\alpha$ and $\beta$ as these can always be obtained by
the linear regression of $y_x$ on $r^x$, where $r$ is an estimate of $\rho$. The resulting fit depends on the
magnitude of $r - \hat{r}$, where $\hat{r}$ is the least-squares estimate. If $r$ is subject to large errors or
biases the fit may be very poor. On the other hand, if the errors and biases in $r$ are small
and the true relationship is in the form of (1) a good fit can always be obtained.
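
The regression step described in the last paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_alpha_beta` and the numerical values are hypothetical, and the sketch simply regresses $y_x$ on $r^x$ for a value of $r$ that has already been obtained by some other means.

```python
def fit_alpha_beta(y, r):
    """Least-squares fit of y_x = alpha - beta * r**x for a fixed value of r."""
    n = len(y)
    u = [r ** x for x in range(n)]            # regressor r^x for x = 0, ..., n-1
    u_bar = sum(u) / n
    y_bar = sum(y) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_uy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    beta = -s_uy / s_uu                       # the slope of y on r^x is -beta
    alpha = y_bar + beta * u_bar              # intercept of the fitted line
    return alpha, beta


# Noise-free illustration with hypothetical values alpha = 10, beta = 4, rho = 0.6.
y_demo = [10 - 4 * 0.6 ** x for x in range(6)]
alpha_hat, beta_hat = fit_alpha_beta(y_demo, 0.6)
```

With noise-free data and $r$ equal to the true $\rho$ the fit returns the generating values of $\alpha$ and $\beta$; with a poor $r$ the same regression still runs but the fitted curve can be badly wrong, which is the point made in the text.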


LINEAR AND QUADRATIC ESTIMATES OF $\rho$


Provided that $\sum_{1}^{n-1} w_x = 0$ the expression

$$r = \frac{A_1}{A_2} = \frac{\sum_{1}^{n-1} w_x y_x}{\sum_{1}^{n-1} w_x y_{x-1}} \tag{2}$$

provides consistent estimates of $\rho$. Several estimates of this type have been proposed (though
not necessarily in this form). These include:

(1) Stevens’s least-squares estimate $\hat{r}$ in which the $w_x$ are complicated functions of $\hat{r}$ itself.

(2) Estimates in which $A_1$ and $A_2$ are both linear functions of the $y_x$. These estimates
will be called ‘linear estimates’. Suitable linear estimates with the $w_x$ not depending on $\rho$
have been discussed by Patterson (1956) for the cases $n = 4, 5, 6$ and $7$.

(3) Estimates in which the $w_x$ are linear functions of the $y_x$, so that $A_1$ and $A_2$ are quadratic
functions of the $y_x$. These estimates will be described as ‘quadratic estimates’. As will be
shown in the next section Hartley’s estimate of $\rho$ is a quadratic estimate.
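
A minimal sketch of the ratio estimate (2) for fixed weights may make the construction concrete. The function name `ratio_estimate` and the demonstration weights are hypothetical choices, not taken from the paper; the only requirement carried over from the text is that the weights $w_1, \ldots, w_{n-1}$ sum to zero.

```python
def ratio_estimate(y, w):
    """Estimate rho by the ratio (2); w[0] is a placeholder, w[1..n-1] are the weights."""
    n = len(y)
    if len(w) != n:
        raise ValueError("w must have the same length as y")
    if abs(sum(w[1:])) > 1e-12:
        raise ValueError("the weights w_1, ..., w_{n-1} must sum to zero")
    numerator = sum(w[x] * y[x] for x in range(1, n))        # sum of w_x * y_x
    denominator = sum(w[x] * y[x - 1] for x in range(1, n))  # sum of w_x * y_{x-1}
    return numerator / denominator


# Noise-free data with hypothetical alpha = 5, beta = 3, rho = 0.6 and n = 4.
y_demo = [5 - 3 * 0.6 ** x for x in range(4)]
w_demo = [0.0, -1.0, 0.0, 1.0]                # illustrative weights summing to zero
r_demo = ratio_estimate(y_demo, w_demo)
```

Because the weights sum to zero the constant $\alpha$ cancels from numerator and denominator, so on noise-free data the ratio reproduces $\rho$ whatever weights are chosen; the choice of weights matters only for the variance and bias once errors are present.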
It is possible, by a suitable choice of the $w_x$, to obtain both linear and quadratic estimates
having minimum variance for some particular value of $\rho$, say $\rho_0$. The quadratic estimates
with this property cover a wider range of $\rho$ (around $\rho_0$) with high efficiency than do the
corresponding linear estimates but, as pointed out by Patterson (1958), they may be subject
to large biases in some circumstances.

Before Hartley’s estimates of $\rho$ are considered in detail some properties of linear estimates,
particularly those having minimum variance, will be presented. The variances and expectations
of Hartley’s estimates will then be determined for comparison with the minimum
variance linear estimates. In the following it will be assumed that the $y_x$ are independently
and normally distributed with variance $\sigma^2$.

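The expectation and variance of a linear estimate $r$ are approximately

$$\mathcal{E}(r) = \rho + \frac{\rho\sum_{1}^{n-1} w_x^2 - \sum_{1}^{n-2} w_x w_{x+1}}{A_2^2}\,\sigma^2 \tag{3}$$

and

$$\operatorname{var}(r) = \frac{(1+\rho^2)\sum_{1}^{n-1} w_x^2 - 2\rho\sum_{1}^{n-2} w_x w_{x+1}}{A_2^2}\,\sigma^2. \tag{4}$$

These expressions are suitable if $\sigma^2$ is small relative to $A_2^2$.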
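
The two approximations can be evaluated numerically as follows. The sketch makes one assumption that should be read with care: the denominator $A_2$ is replaced by its expected value $-\beta\sum w_x\rho^{x-1}$, which is reasonable only in the small-$\sigma^2$ regime to which (3) and (4) apply. The function name and the demonstration values are hypothetical.

```python
def linear_estimate_moments(w, rho, beta, sigma2):
    """Approximate expectation and variance of a linear estimate, after (3) and (4).

    w[0] is a placeholder and w[1..n-1] are the fixed weights.  A_2 is replaced
    by its expected value -beta * sum(w_x * rho**(x-1)); this is an assumption of
    the sketch, adequate only when sigma2 is small relative to A_2**2.
    """
    n = len(w)
    sum_sq = sum(w[x] ** 2 for x in range(1, n))               # sum of w_x^2
    sum_lag = sum(w[x] * w[x + 1] for x in range(1, n - 1))    # sum of w_x * w_{x+1}
    a2 = -beta * sum(w[x] * rho ** (x - 1) for x in range(1, n))
    expectation = rho + (rho * sum_sq - sum_lag) * sigma2 / a2 ** 2
    variance = ((1 + rho ** 2) * sum_sq - 2 * rho * sum_lag) * sigma2 / a2 ** 2
    return expectation, variance


# Hypothetical illustration: n = 4, weights (-1, 0, 1), rho = 0.5, beta = 2, sigma^2 = 0.01.
e_demo, v_demo = linear_estimate_moments([0.0, -1.0, 0.0, 1.0], 0.5, 2.0, 0.01)
```

Both the bias term and the variance scale with $\sigma^2/A_2^2$, so doubling $\beta$ (which doubles $A_2$) reduces each by a factor of four.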
Patterson (1958) has shown that for $\rho < 1$ the expression for $\operatorname{var} r$ is at a minimum when
$\rho = \rho_0$ if

$$(1+\rho_0^2)\,w_x - \rho_0(w_{x+1} + w_{x-1}) = k_1(\rho_0^{x-1} + k_2), \tag{5}$$

where $x = 1, 2, \ldots, n-1$; $w_0 = w_n = 0$; $k_1$ is given any convenient value such that the $w_x$
are not all zero and $k_2$ is such that $\sum w_x$ is zero. A solution of equation (5) (with $k_1 = 1$) is

$$w_x = \sum_j d_{ij}\,\rho_0^{j-1} \qquad (x = i), \tag{6}$$

where

$$d_{ij} = c_{ij} - \frac{\sum_j c_{ij}\,\sum_i c_{ij}}{\sum_{i,j} c_{ij}} \tag{7}$$

and

$$c_{ij} = c_{ji} = \frac{\rho_0^{j-i}(1-\rho_0^{2i})(1-\rho_0^{2n-2j})}{(1-\rho_0^{2n})(1-\rho_0^{2})} \qquad (i \le j). \tag{8}$$

Equation (5) shows that, in this case (i.e. $k_1 = 1$),

$$(1+\rho^2)\sum_{1}^{n-1} w_x^2 - 2\rho\sum_{1}^{n-2} w_x w_{x+1} = \sum_{1}^{n-1} w_x\rho^{x-1} \tag{9}$$

when $\rho = \rho_0$. Consequently, the variance of the minimum variance estimate is

$$\operatorname{var}(r) = \frac{\sigma^2}{\beta^2\sum_{1}^{n-1} w_x\rho^{x-1}}, \tag{10}$$

where the $w_x$ are as defined in (6).
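
The construction of the minimum-variance weights can be checked numerically. The sketch below assumes the closed form $c_{ij} = \rho_0^{j-i}(1-\rho_0^{2i})(1-\rho_0^{2n-2j})/\{(1-\rho_0^{2n})(1-\rho_0^{2})\}$ for $i \le j$, which is the inverse of the tridiagonal matrix appearing on the left of (5); the function names and the values $n = 6$, $\rho_0 = 0.5$ are hypothetical.

```python
def c_matrix(n, rho0):
    """Symmetric (n-1) x (n-1) matrix of c_ij for rho0 < 1, indices i, j = 1, ..., n-1."""
    m = n - 1
    c = [[0.0] * m for _ in range(m)]
    denom = (1 - rho0 ** (2 * n)) * (1 - rho0 ** 2)
    for i in range(1, n):
        for j in range(i, n):
            value = rho0 ** (j - i) * (1 - rho0 ** (2 * i)) * (1 - rho0 ** (2 * (n - j))) / denom
            c[i - 1][j - 1] = c[j - 1][i - 1] = value
    return c


def d_matrix(c):
    """d_ij = c_ij - (row sum i) * (column sum j) / (grand total), as in (7)."""
    m = len(c)
    row = [sum(c[i]) for i in range(m)]       # c is symmetric, so row sums = column sums
    total = sum(row)
    return [[c[i][j] - row[i] * row[j] / total for j in range(m)] for i in range(m)]


def min_variance_weights(n, rho0):
    """Weights (6) with k_1 = 1, padded so that w[0] = w[n] = 0."""
    d = d_matrix(c_matrix(n, rho0))
    inner = [sum(d[i][j] * rho0 ** j for j in range(n - 1)) for i in range(n - 1)]
    return [0.0] + inner + [0.0]


n_demo, rho_demo = 6, 0.5
w = min_variance_weights(n_demo, rho_demo)
# Left-hand side of (5) minus rho0^(x-1): the same constant k_2 for every x.
k2_values = [(1 + rho_demo ** 2) * w[x] - rho_demo * (w[x + 1] + w[x - 1]) - rho_demo ** (x - 1)
             for x in range(1, n_demo)]
# Both sides of identity (9) at rho = rho0.
lhs9 = ((1 + rho_demo ** 2) * sum(w[x] ** 2 for x in range(1, n_demo))
        - 2 * rho_demo * sum(w[x] * w[x + 1] for x in range(1, n_demo - 1)))
rhs9 = sum(w[x] * rho_demo ** (x - 1) for x in range(1, n_demo))
```

The list `k2_values` should be constant across $x$ and `lhs9` should agree with `rhs9` up to rounding, which is a direct numerical check that the weights satisfy (5) and (9).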


Linear estimates having full efficiency when $\rho = \rho_0$ can conveniently be denoted by $r(\rho_0)$.
It is also possible to obtain estimates which tend towards full efficiency as $\rho$ tends to 1.
These estimates will be denoted by $r(1)$. As $\rho_0$ approaches 1 the $w_x$ given by (6) tend to be
proportional to $\sum_j j\,d_{ij}$ where $x = i$. When $\rho_0 = 1$

$$c_{ij} = c_{ji} = i(n-j)/n \qquad (i \le j). \tag{11}$$

Consequently $r(1)$ is given by equation (2) with

$$w_x = \tfrac{1}{12}\,x(n-x)(2x-n). \tag{12}$$

As pointed out by Patterson (1956) the linear estimates $r(\rho_0)$ are equal to $\hat{r}$, the
least-squares estimates of $\rho$, when $\rho_0 = \hat{r}$. The approximate formulae given above for the
expectation and variance of $r(\rho)$ are also appropriate for use with the least-squares estimate $\hat{r}$.

Quadratic estimates of $\rho$ with minimum variance when $\rho = \rho_0$ are given by the ratio (2)
with

$$w_x = \sum_j d_{ij}(k\,y_{j-1} + l\,y_j), \tag{13}$$

where $x = i$, $i = 1, 2, \ldots, n-1$, at least one of $k$ and $l$ is non-zero and the $d_{ij}$ are as defined
in (7) and (8) for $\rho_0 < 1$ or (7) and (11) for $\rho_0 = 1$. Only the value of $\rho_0$ and the ratio $k/l$ are
required to specify these estimates. Thus, the $k$ and $l$ of (13) can be multiplied by any
non-zero finite number without affecting the estimates of $\rho$. A convenient notation for quadratic
estimates with minimum variance when $\rho = \rho_0$ is, therefore, $r(\rho_0, k/l)$. This differs
slightly from the notation previously used by Patterson (1958).

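The expectation and variance of $r(\rho_0, k/l)$ are approximately

$$\mathcal{E}(r) = \rho + \frac{\sigma^2}{\beta^2}\left\{\frac{(l\rho^2 + 2k\rho - l)A - 2kB}{(k+l\rho)R^2} + \frac{(l-k\rho)\operatorname{trace} D + (k-l\rho)\operatorname{trace} DU}{(k+l\rho)R}\right\} \tag{14}$$

and

$$\operatorname{var}(r) = \frac{\sigma^2\{(1+\rho^2)A - 2\rho B\}}{\beta^2 R^2}, \tag{15}$$

where

$$R = \sum_{1}^{n-1} W_x\rho^{x-1}, \qquad A = \sum_{1}^{n-1} W_x^2, \qquad B = \sum_{1}^{n-2} W_x W_{x+1}, \tag{16}$$

$$W_x = \sum_j d_{ij}\,\rho^{j-1}, \qquad x = i, \tag{17}$$

$D$ is the matrix of $d_{ij}$ defined by equations (7) and (8) or (7) and (11) and $U$ is an $(n-1) \times (n-1)$
matrix

$$U = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & 0\\ 1 & 0 & 0 & \cdots & 0 & 0\\ 0 & 1 & 0 & \cdots & 0 & 0\\ \vdots & & & & & \vdots\\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix}. \tag{18}$$

The derivation of these expressions has been given by Patterson (1958). Thus the variance
of $r(\rho_0, k/l)$ is independent of $k/l$. When $\rho = \rho_0$ (10) and (15) are identical.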

As will be shown later the family of estimates $r(1, k/l)$ include Hartley’s estimate. The
expectation and variance of $r(1, k/l)$ will therefore be considered in rather more detail.
For these estimates the $d_{ij}$ are given by (7) and (11) and

$$\operatorname{trace} D = \tfrac{1}{15}(n^2 - 4), \qquad \operatorname{trace} DU = \tfrac{1}{30}(2n^2 - 15n + 22). \tag{19}$$
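
The two traces can be confirmed by direct computation. The following sketch, in exact rational arithmetic, assumes that $U$ has ones immediately below the diagonal and zeros elsewhere, so that trace $DU$ is the sum of the elements $d_{i,i+1}$; the function names are hypothetical.

```python
from fractions import Fraction


def d_matrix_unit_rho(n):
    """Exact D for rho_0 = 1: c_ij = i(n-j)/n for i <= j, then d_ij as in (7)."""
    c = [[Fraction(min(i, j) * (n - max(i, j)), n) for j in range(1, n)] for i in range(1, n)]
    row = [sum(r) for r in c]
    total = sum(row)
    return [[c[i][j] - row[i] * row[j] / total for j in range(n - 1)] for i in range(n - 1)]


def traces(n):
    """Return (trace D, trace DU) for the (n-1) x (n-1) matrices D and U."""
    d = d_matrix_unit_rho(n)
    trace_d = sum(d[i][i] for i in range(n - 1))
    # (DU)_ii = d_{i,i+1} because the only non-zero entry in column i of U is in row i+1.
    trace_du = sum(d[i][i + 1] for i in range(n - 2))
    return trace_d, trace_du


checks = [traces(n) == (Fraction(n * n - 4, 15), Fraction(2 * n * n - 15 * n + 22, 30))
          for n in range(3, 10)]
```

Every entry of `checks` should be true. For $n = 3$ the matrix $D$ has diagonal elements $\tfrac16$ and off-diagonal elements $-\tfrac16$, giving trace $D = \tfrac13$ and trace $DU = -\tfrac16$.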


For the purpose of studying the bias and variance of $r(1, k/l)$ when $n$ is very large it is
convenient to express the $W_x$ in the alternative form

$$W_x = a + bx + cx^2 + d\rho^x, \tag{20}$$

where $d = -1/(1-\rho)^2$ and $a$, $b$ and $c$ are such that $W_0 = W_n = \sum W_x = 0$. For large $n$, constant
$\theta = \rho^n$ and $x = nz$,

$$W_x \to \frac{n^2}{(\ln\theta)^2}\,(1 + b'z + c'z^2 - \theta^z), \tag{21}$$

where

$$\tfrac12 b' + \tfrac13 c' = -1 - \frac{1-\theta}{\ln\theta}, \tag{22}$$

$$b' + c' = -1 + \theta. \tag{23}$$

[Equations (24) and (25), which define $L_1$ and $L_2$ in terms of $\theta$, $b'$ and $c'$, and the limit (26) for
$\beta^2 n^3 \operatorname{var}(r)/\sigma^2$, which involves $(\ln\theta)^2$, $L_1$ and $L_2$, are not legible in the source.]

It can also be shown, using the technique described by Stevens (1951, pp. 264-5), that
$n^3 \operatorname{var} r(\rho)$ tends to limiting values for constant $\theta$ as $n$ is increased. Hence the efficiencies
of $r(1, k/l)$ for very large $n$ depend only on $\theta$.

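The bias in $r(1, k/l)$, when $k + l \neq 0$, tends to

[Equation (27) is not legible in the source; the surviving fragments show a term in
$\{15(l-k) - 2(k+l)\ln\theta\}/\{30(k+l)L_1\}$ and a term in $2\ln\theta$ divided by $L_1^2$.]

as $n$ increases with $\theta$ constant.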
HARTLEY’S ESTIMATE OF p 
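Hartley (1948)† noted that, if $y = \alpha - \beta p^x$, then

$$y_x = b_1 + b_2 x + b_3 Y_x, \tag{28}$$

where

$$Y_x = Y_{x-1} + \tfrac{1}{2} y_{x-1} + \tfrac{1}{2} y_x, \tag{29}$$

$$2Y_0 = \begin{cases} -y_0 - 2(y_1 + y_2 + \dots + y_{\frac{1}{2}(n-2)}) & \text{for even } n, \\ -y_0 - 2(y_1 + y_2 + \dots + y_{\frac{1}{2}(n-3)}) - y_{\frac{1}{2}(n-1)} & \text{for odd } n, \end{cases} \tag{30}$$

and $p = (2 + b_3)/(2 - b_3)$.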


He therefore suggested that α, β and p could be estimated by fitting (28) to the data by least
squares. He called this procedure ‘internal regression’.
The observational equations can be written in matrix notation as 


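$$\mathbf{y} = \mathbf{1}_n b_1 + \mathbf{x} b_2 + \mathbf{Y} b_3, \tag{31}$$

where $\mathbf{y}$ is a column vector with elements $y_x$, $\mathbf{1}_n$ is a column vector consisting of n 1's and
$\mathbf{x}$ is a column vector with elements x = 0, 1, 2, ..., n−1. The least-squares normal equations
are then

$$\begin{aligned} \mathbf{1}_n'\mathbf{y} &= n b_1 + \mathbf{1}_n'\mathbf{x}\, b_2 + \mathbf{1}_n'\mathbf{Y} b_3, \\ \mathbf{x}'\mathbf{y} &= \mathbf{x}'\mathbf{1}_n b_1 + \mathbf{x}'\mathbf{x}\, b_2 + \mathbf{x}'\mathbf{Y} b_3, \\ \mathbf{Y}'\mathbf{y} &= \mathbf{Y}'\mathbf{1}_n b_1 + \mathbf{Y}'\mathbf{x}\, b_2 + \mathbf{Y}'\mathbf{Y} b_3. \end{aligned} \tag{32}$$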
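
The fitting procedure described above can be illustrated by a minimal sketch. It is not taken from the paper: the function names `solve_linear` and `internal_regression` are hypothetical, $Y_0$ is taken to be zero rather than defined by (30) (the estimate of p does not depend on this choice), and the ordinates are assumed to be error-free so that the fitted values can be checked against the known parameters.

```python
def solve_linear(a, c):
    """Solve the small system a b = c by Gaussian elimination with partial pivoting."""
    m = len(c)
    aug = [row[:] + [rhs] for row, rhs in zip(a, c)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for j in range(col, m + 1):
                aug[r][j] -= f * aug[col][j]
    b = [0.0] * m
    for r in reversed(range(m)):
        s = aug[r][m] - sum(aug[r][j] * b[j] for j in range(r + 1, m))
        b[r] = s / aug[r][r]
    return b


def internal_regression(y):
    """Fit y_x = b1 + b2*x + b3*Y_x by least squares; return estimates of (p, alpha, beta)."""
    n = len(y)
    # Progressive sums, equation (29), with Y_0 = 0 (an assumption of this sketch).
    Y = [0.0]
    for x in range(1, n):
        Y.append(Y[-1] + 0.5 * (y[x - 1] + y[x]))
    columns = [[1.0] * n, [float(x) for x in range(n)], Y]
    # Normal equations (32): (columns' columns) b = columns' y.
    a = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    c = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    b1, b2, b3 = solve_linear(a, c)
    p_hat = (2.0 + b3) / (2.0 - b3)
    alpha_hat = -b2 / b3
    beta_hat = -b2 / b3 - b1  # valid only because Y_0 = 0 here
    return p_hat, alpha_hat, beta_hat


# Error-free ordinates from y = alpha - beta * p**x with alpha = 10, beta = 5, p = 0.6.
y = [10.0 - 5.0 * 0.6 ** x for x in range(6)]
p_hat, alpha_hat, beta_hat = internal_regression(y)
```

With error-free data the linear relation (28) holds exactly, so the three estimates reproduce the parameters used to generate the ordinates up to rounding error.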

[† See also Hartley's note printed on p. 293 below. Ed.]
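For the purpose of obtaining an explicit expression for $b_3$ (and hence for the estimate of p)
it is convenient to introduce the matrices

$$T \text{ (of order } (n-1) \times n) = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -1 & 1 \end{pmatrix}, \qquad V = TT', \qquad D = V^{-1} - \frac{V^{-1}\mathbf{1}\mathbf{1}'V^{-1}}{\mathbf{1}'V^{-1}\mathbf{1}},$$

J, an n × n matrix with diagonal terms equal to (n−1)/n and non-diagonal terms equal to
−1/n, and $\mathbf{1}$, a column vector with n−1 elements each equal to 1. It can be verified that

$$T'V^{-1}T = J$$

and that the elements $d_{ij}$ of D are as defined by equations (7) and (11).

Elimination of $b_1$ from (32) yields the two equations:

$$\begin{aligned} \mathbf{x}'J\mathbf{y} &= \mathbf{x}'J\mathbf{x}\, b_2 + \mathbf{x}'J\mathbf{Y} b_3, \\ \mathbf{Y}'J\mathbf{y} &= \mathbf{Y}'J\mathbf{x}\, b_2 + \mathbf{Y}'J\mathbf{Y} b_3. \end{aligned} \tag{33}$$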


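Since $\mathbf{x}'T' = \mathbf{1}'$, $T\mathbf{y} = \mathbf{y}_1 - \mathbf{y}_0$ and $2T\mathbf{Y} = \mathbf{y}_0 + \mathbf{y}_1$, where $\mathbf{y}_0$ is a column vector $(y_0, y_1, \dots, y_{n-2})$
and $\mathbf{y}_1$ is a column vector $(y_1, y_2, \dots, y_{n-1})$, the regression $b_3$ can be written

$$b_3 = \frac{2(\mathbf{y}_0 + \mathbf{y}_1)' D (\mathbf{y}_1 - \mathbf{y}_0)}{(\mathbf{y}_0 + \mathbf{y}_1)' D (\mathbf{y}_0 + \mathbf{y}_1)}.$$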


Hence the estimate of p is

$$\frac{2 + b_3}{2 - b_3} = \frac{(\mathbf{y}_0 + \mathbf{y}_1)' D \mathbf{y}_1}{(\mathbf{y}_0 + \mathbf{y}_1)' D \mathbf{y}_0}. \tag{34}$$
This is equivalent to equation (2) with the weights w given by (13), $p_0 = 1$, and k/l = 1. Consequently,
Hartley's estimate of p is equal to r(1, 1), one of the family of quadratic estimates
which tend towards full efficiency as p tends to 1.

It will be noted that the estimation of p does not depend on the definition of $Y_0$. Nor does
the estimate of α, given by $-b_2/b_3$. Alternative estimates of β can, however, be obtained
depending on $Y_0$. With $Y_0$ given by equation (30), $\beta p^{\frac{1}{2}(n-1)}$ is estimated by

$$(-b_2/b_3) - b_1 - (n-1)b_2/2.$$

If $Y_0$ is taken to be zero the estimate of β is $-[b_2/b_3] - b_1$.


ALTERNATIVE INTERNAL REGRESSION METHODS 


The reasons behind Hartley’s choice of equations (28), (29) and (30) were not made clear in 
the original paper (Hartley, 1948), and other quadratic estimates of p obtained by ‘internal 
regression’ methods have sometimes been used as alternatives to r(1,1). Finney (1958) 
considered the simpler equations 


$$y_x = \alpha(1-p) + p\,y_{x-1} \tag{35}$$

and

$$y_{x+1} - y_x = 2\alpha\,\frac{1-p}{1+p} - \frac{1-p}{1+p}\,(y_x + y_{x+1}), \tag{36}$$
which lead to r(0, ∞) and r(0, 1), respectively, as estimates of p. In a detailed investigation
of the case n = 4 he found that Hartley's estimate of p showed no special advantage over
r(0, ∞) and r(0, 1) and suggested that this result might also be true for larger values of n.

Other estimates, in the same family r(1, k/l) as Hartley’s estimates, can be obtained from 
(28) by replacing equation (29) by 


$$Y_x - Y_{x-1} = k y_{x-1} + l y_x, \qquad k + l \neq 0. \tag{37}$$


The estimate of p in this case is $(1 + kb_3)/(1 - lb_3)$, the estimate of α is $-b_2/\{(k+l)b_3\}$ and if
$Y_0$ is taken to be zero the estimate of β is $-[b_2/\{(k+l)b_3\}] - b_1$. An estimate r(1, k/l) with
k + l = 0 exists but cannot be obtained from equation (28). In the introduction to his paper,
Hartley (1948) appeared to be considering r(1, ∞) rather than r(1, 1) (see equation (6) of
Hartley's paper). Monroe (1949), reported by Anderson (1956), developed procedures for
the case r(1, 0). Anderson (1956) noted that different estimates (sometimes very different)
are obtained by Hartley's and Monroe's procedures. White (1956) has also considered
alternative estimates in the family r(1, k/l).
A more general relationship than (28) can also be used (Patterson, 1958). This is 


$$y_x = b_1 + b_2 p_0^x + b_3 Y_x,$$
where
$$Y_x - p_0 Y_{x-1} = k y_{x-1} + l y_x.$$
The estimate of p in this case is $r(p_0, k/l)$. It is given by $(p_0 + kb_3)/(1 - lb_3)$. Estimates of this


type have the advantage over r(1, k/l) that they can be arranged to have full efficiency within 
the range of values of p occurring in practice. 
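
As a check on the expression $(p_0 + kb_3)/(1 - lb_3)$, the following short derivation may be helpful. It is a sketch added for illustration, assuming error-free ordinates $y_x = \alpha - \beta p^x$ and the two relations quoted above; it is not reproduced from Patterson (1958). Taking $y_x - p_0 y_{x-1}$ removes the term in $b_2$:

$$\begin{aligned}
y_x - p_0 y_{x-1} &= b_1(1 - p_0) + b_3 (Y_x - p_0 Y_{x-1}) = b_1(1 - p_0) + b_3 (k y_{x-1} + l y_x), \\
\alpha(1 - p_0) - \beta p^{x-1}(p - p_0) &= b_1(1 - p_0) + b_3 (k + l)\alpha - b_3 \beta p^{x-1}(k + lp).
\end{aligned}$$

Equating the coefficients of $p^{x-1}$ gives $p - p_0 = b_3(k + lp)$, that is $p = (p_0 + kb_3)/(1 - lb_3)$. The same matching of coefficients applied to (28) and (37) gives $(1 + kb_3)/(1 - lb_3)$.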


NUMERICAL RESULTS 


The variances and biases of the estimates r(1, k/l) relative to the least-squares estimates r̃
have been investigated numerically using the formulae quoted in previous sections and the
results are set out in Tables 1-10 printed together at the end of this paper. The tabulated
values S and T are defined by:

$$\text{standard error of } r = S\sigma/\beta, \tag{38}$$
$$\text{bias in } r = \mathcal{E}(r) - p = B\sigma^2/\beta^2 \tag{39}$$
and
$$T = B/S. \tag{40}$$


The tables are given in terms of $\theta = p^{n-1}$ rather than p because large values of p within the
range 0 < p < 1 tend to become more important as n increases. If n is changed by changing
the interval but not the range of the independent variate then $p^{n-1}$ remains constant. Also,
$n^3 \operatorname{var} r$ and $n^2 \operatorname{bias} r$ tend to non-zero limiting values as n is increased with θ constant
(0 < θ < 1) both for r = r̃ and for r = r(1, k/l).

Values of S for r̃ are set out in Table 1 for various θ and n = 4 to 7, 9, 12 and 20. Corresponding
values for the estimates r(1, k/l) are given in Table 2; these values are independent
of k/l. As n increases the $Sn^{3/2}$ tend to limiting values which are set out in Table 9. The
efficiencies of r(1, k/l), given by the ratios of $S^2$ for r̃ and r(1, k/l), are set out in Table 3 and,
for very large n, in Table 4.
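
The ratio of squared multipliers can be illustrated with a few entries. The sketch below is an illustration only: the function name `percentage_efficiency` is hypothetical, and the S values are the n = 9 entries of Tables 1 and 2 as printed at the end of the paper.

```python
# S multipliers for n = 9, copied from Table 1 (least squares) and Table 2 (r(1, k/l)).
S_LEAST_SQUARES_N9 = {0.0: 1.069, 0.000125: 0.996, 0.001: 0.939}
S_QUADRATIC_N9 = {0.0: 1.349, 0.000125: 1.059, 0.001: 0.970}


def percentage_efficiency(s_efficient, s_estimate):
    """Efficiency as 100 * (S of the efficient estimate)^2 / (S of the other estimate)^2."""
    return 100.0 * (s_efficient / s_estimate) ** 2


efficiencies_n9 = {
    theta: percentage_efficiency(S_LEAST_SQUARES_N9[theta], S_QUADRATIC_N9[theta])
    for theta in S_LEAST_SQUARES_N9
}
```

For θ = 0 and n = 9 this gives an efficiency of about 63%, and for θ = 0.001 about 94%.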

The estimates r(1, k/l) tend towards full efficiency as θ approaches 1 but this is of no value
in itself as the standard errors of even the efficient estimates tend to be very large in this
region. They are, however, also highly efficient over a wide range of smaller values of θ.
In fact the efficiencies only tend to become low for large n and very small values of θ.

The generally high efficiencies of r(1,k/l) compare very favourably with the efficiencies 
of other non-efficient estimates of p. Thus: 

(a) The estimates r(0, k/l) are very efficient when n = 4, but their efficiencies fall off
rapidly when n is increased (Patterson, 1958).

(b) The efficiencies of linear estimates, $r(p_0)$, of the type considered by Patterson (1956)
fall off relatively quickly as the difference between p and $p_0$ increases. Table 4 shows that,
for very large n, r(1, k/l) covers a rather wider range of θ with high efficiency than two linear
estimates together: r(1) and $r(p_0)$ with $p_0^{n-1} = 0.0071$. The simplicity of calculation of linear
estimates is still useful in practice, however, for example as initial estimates in the iterative
least-squares procedure of Stevens (1951).

(c) Estimates $r(p_0, k/l)$ with suitably chosen $p_0$ lying between 0 and 1 cover a wider range
of θ than r(1, k/l). Thus if $p_0^{n-1}$ is taken to be 0.0071 the estimate is 95% efficient (for very
large n) for θ > 0.000075 whilst the corresponding limit for r(1, k/l) is about 0.0015. Very
small values of θ appear, however, to be rare in practice.

The values of T set out in Tables 5-8 give, when multiplied by σ/β, the ratios of the biases
to the standard errors of r̃ and of r(1, k/l), with k/l = 0, 1 and ∞. Unlike the variances the
biases in r(1, k/l) depend on k/l. Limiting values of $Tn^{1/2}$ are given in Table 9. Negative values
of T indicate that p is underestimated.

The biases in Hartley's estimate r(1, 1) (Table 7) are often considerably greater than
those in the least-squares estimate r̃ (Table 5). For many purposes, however, they can
probably be ignored as they are not large unless the estimates of p have large standard errors
and are consequently already unsatisfactory. For example, if n = 12, θ = 0.001 (i.e.
p = 0.53) and σ = 0.1β the standard error of r(1, 1) is ±0.08 and the bias is about 0.01.
σ/β would have to be about 0.42 for the bias to be one-half of the standard error, in which
case the standard error would be as high as 0.32.
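
The arithmetic of this example follows directly from definitions (38) to (40). The sketch below is illustrative only: the function names are hypothetical, and S and T are the n = 12, θ = 0.001 entries of Tables 2 and 7 as printed.

```python
S = 0.789  # Table 2, n = 12, theta = 0.001
T = 1.182  # Table 7, n = 12, theta = 0.001


def standard_error(s_multiplier, sigma_over_beta):
    """Equation (38): standard error of r = S * sigma / beta."""
    return s_multiplier * sigma_over_beta


def bias(s_multiplier, t_multiplier, sigma_over_beta):
    """Equations (39) and (40): bias = B * (sigma / beta)^2 with B = T * S."""
    return t_multiplier * s_multiplier * sigma_over_beta ** 2


p = 0.001 ** (1.0 / (12 - 1))           # theta = p^(n-1)
se = standard_error(S, 0.1)              # sigma = 0.1 * beta
b = bias(S, T, 0.1)
sigma_over_beta_for_half = 0.5 / T       # value at which bias / s.e. = T * sigma / beta = 1/2
```

These give p of about 0.53, a standard error of about 0.08, a bias of about 0.01, and σ/β of about 0.42 for the bias to reach one-half of the standard error.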

The biases in the estimates r(1, 0) and r(1, ∞) are generally much larger and except when
n = 4 there is no point in using either of these estimates in preference to r(1, 1). If n = 4
the biases in r(1, ∞) are smaller than the biases in r(1, 1) but even this is of little value as
much simpler estimates, e.g. r(0, 2.75), with as high efficiency and very low bias are available.

Bias in Hartley's estimate may become serious if averages of several independent estimates
of p are required. The bias can, of course, be estimated from the tables, provided that an
accurate estimate of σ/β is available. For this purpose σ requires to be determined from variation
between replicates. This method is likely to deal satisfactorily only with small biases. If
the biases are large the approximations involved in calculating the T and S may no longer be
adequate and the estimate of β is likely to be poor. An alternative and generally more reliable
method for dealing with bias is to replace r(1, 1) by r(1, k/l) with values of k/l given by
Table 10. These values of k/l are such that the biases in r(1, k/l) are zero. Great accuracy in
k/l is not required. Thus for n > 5 a single value of k/l, 1.5 say, can be used for most purposes.

The standard errors and biases of functions of p can also be obtained using the tabulated
values of S and T. Thus, for example, the function −(ln p)/m, where m is a scaling factor,
is often of greater interest than p itself. In this case if r is an estimate of p and A = −(ln r)/m,
then

$$\operatorname{var}(A) = \left(\frac{dA}{dr}\right)^2 \operatorname{var}(r) = \frac{\operatorname{var}(r)}{m^2 p^2}, \tag{41}$$

so that

$$\text{S.E.}(A) = \frac{\text{S.E.}(r)}{mp} = \frac{S\sigma}{mp\beta}. \tag{42}$$

The bias in A is

$$\operatorname{bias}(A) = \mathcal{E}(A) + \frac{\ln p}{m} = \frac{dA}{dr}\operatorname{bias}(r) + \frac{1}{2}\frac{d^2A}{dr^2}\operatorname{var}(r) \tag{43}$$

$$= -\frac{\operatorname{bias}(r)}{mp} + \frac{\operatorname{var}(r)}{2mp^2}, \tag{44}$$

i.e.

$$\frac{\operatorname{bias}(A)}{\text{S.E.}(A)} = -\frac{\operatorname{bias}(r)}{\text{S.E.}(r)} + \frac{\text{S.E.}(r)}{2p}. \tag{45}$$

In these expressions all standard errors are taken to be positive.

The efficiencies of A are, of course, identical with the efficiencies of r. The loss of efficiency
of Hartley's estimate for low θ may be of some importance when the estimation of A is of
primary interest. The total amount of information on A is proportional to $p^2(\ln p)^2/\operatorname{var}\tilde{r} = I$
which has its maximum when θ is small. Thus when n = 20 the values of I are as follows:

θ    0.000001    0.000125    0.001    0.008    0.064
I    0.228       0.275       0.289    0.267    0.146

Hence useful estimates of A are available when θ is as low as $10^{-6}$ and it would therefore
be worth while in this case to use a better estimate than the 72% efficient Hartley estimate.
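
The tabulated values of I can be approximately reproduced from Table 1. The sketch below is an illustration under stated assumptions: the variance of the least-squares estimate is taken as S² (that is, σ/β is set to 1), the S values are the n = 20 column of Table 1 as printed, and the function name `information` is hypothetical. Because S is given to three decimals only, agreement is to within about one unit in the third decimal.

```python
import math

N = 20
# S multipliers of the least-squares estimate for n = 20 (Table 1 as printed).
S_TABLE_1_N20 = {0.000125: 0.561, 0.001: 0.470, 0.008: 0.382, 0.064: 0.328}


def information(theta, s_multiplier, n=N):
    """I = p^2 (ln p)^2 / var(r), with var(r) = S^2 and theta = p^(n-1)."""
    p = theta ** (1.0 / (n - 1))
    return (p * math.log(p)) ** 2 / s_multiplier ** 2


info = {theta: information(theta, s) for theta, s in S_TABLE_1_N20.items()}
```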

As in the case of the estimates of p, the expectation of the Hartley estimate of A is closer
to the expectation of the least-squares estimate than are the estimates with k/l = 0 or ∞.
This can be readily seen from equation (45); the second term on the right-hand side is 
approximately the same for all four estimates. 

Thus similar recommendations can be made for the estimation of both p and A. If high- 
speed computers are available Stevens’s method should be used as it ensures a good fit in 
all cases in which a good fit is possible. If computing facilities are limited and the ordinates 
are equally spaced, Hartley’s method can be used and in most cases will be superior, in the 
sense of having high efficiency and relatively low bias, to many other inefficient estimates 
which have been suggested. A worthwhile modification that can be made is to replace the 
k/l = 1 in the Hartley procedure by k/l = 1-5 or some other convenient value determined 
from Table 10. This will often result in a reduction of bias. If the curvature is very marked 
so that most of the y’s lie near the asymptote it may be desirable to use the related estimate 
$r(p_0, k/l)$ with $p_0$ not equal to 1.


SUMMARY 


The efficiencies and biases of the estimates of p in the equation $y = \alpha - \beta p^x$, obtained by the
method of Hartley (1948), have been examined numerically using general formulae derived 
by Patterson (1958). It is shown that while Hartley’s method generally leads to estimates 
having relatively high efficiencies and small biases, and can therefore be regarded as 
adequate for many purposes, the same conclusion does not hold for similar methods which 
are sometimes used in place of the original Hartley method. 
Note added in proof


Another method of estimating p which deserves to be mentioned is to replace the w
of equation (2) by suitable functions u + rv and solve the resulting quadratic equation
in r. In the case n = 4, using u = −10, 5, 5 and v = −17, −5, 22, efficiencies of at least
99.9% are obtained over the entire range of p and the estimates are close approximations
to r̃. For n = 5 suitable values of u and v are u = −16, 5, 7, 4 and v = −14, −22, 10, 26.
There are grounds for believing that in general high efficiency can be obtained at least
as easily by this method as by the use of quadratic estimates. H.D.P.








Table 1. Multipliers, S, of σ/β in terms of the standard error of r̃

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         1.225    1.155    1.118    1.095    1.069    1.049    1.028
0.000125    1.267    1.200    1.149    1.100    0.996    0.847    0.561
0.001       1.312    1.228    1.153    1.079    0.939    0.762    0.470
0.008       1.421    1.287    1.166    1.056    0.871    0.668    0.382
0.064       1.777    1.503    1.287    1.114    0.860    0.621    0.328
0.216       2.560    2.035    1.668    1.398    1.034    0.718    0.362
0.512       5.024    3.772    2.981    2.437    1.744    1.178    0.575

Table 2. Multipliers, S, of σ/β in terms of the standard error of r(1, k/l)

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         1.247    1.225    1.241    1.272    1.349    1.475    1.787
0.000125    1.282    1.237    1.200    1.158    1.059    0.906    0.601
0.001       1.322    1.250    1.182    1.010    0.970    0.789    0.487
0.008       1.424    1.295    1.176    1.066    0.861    0.676    0.386
0.064       1.777    1.504    1.288    1.115    0.860    0.621    0.328
0.216       2.560    2.035    1.668    1.398    1.034    0.718    0.362
0.512       5.024    3.772    2.981    2.437    1.744    1.178    0.575

Table 3. Percentage efficiencies of r(1, k/l)

n = 9

63
80
88
94
98

For θ > 0.064 the efficiency is almost 100%.


Table 4. Percentage efficiencies of linear and quadratic estimates for very large n

θ: 0.000001, 0.000125, 0.001, 0.008, 0.064, 0.216, 0.512, 1.000
Linear estimates: $r(p_0)$*
Quadratic estimates: r(1, k/l), $r(p_0, k/l)$*
* $p_0^{n-1} = 0.0071$.


Table 5. Multipliers T of σ/β in the ratio {bias of r̃}/{standard error of r̃}

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         0.204    0.096    0.056             0.019    0.010    0.003
0.000125    0.223    0.059   -0.046            -0.247   -0.361   -0.482
0.001       0.253    0.062   -0.061            -0.273   -0.376   -0.464
0.008       0.344    0.119   -0.019            -0.234   -0.323   -0.387
0.064       0.633    0.354    0.187            -0.055   -0.151   -0.225
0.216       1.134    0.743    0.511    0.359    0.175    0.037   -0.085
0.512       2.451    1.706    1.266    0.978    0.628    0.360

Table 6. Multipliers T of σ/β in the ratio {bias of r(1, 0)}/{standard error of r(1, 0)}

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         (T = ∞ for all n)
0.000125    18.283   15.141   13.039   11.700   10.067   8.676    6.800
0.001       9.479    9.051    8.417    7.184    7.138    6.360    5.144
0.008       5.249    5.650    5.608    5.472    5.052    4.742    3.942
0.064       3.585    4.122    4.268    4.278    4.156    3.903    3.316
0.216       3.777    4.315    4.493    4.527    4.436    4.187    3.579
0.512       5.975    6.626    6.846    6.885    6.741    6.370    5.456


Table 7. Multipliers T of σ/β in the ratio {bias of r(1, 1)}/{standard error of r(1, 1)}

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         0.534    1.021    1.805    2.789    5.198    9.684    25.430
0.000125    0.520    0.762    1.040    1.274    1.588    1.804    1.857
0.001       0.522    0.655    0.812    0.852    1.091    1.182    1.165
0.008       0.563    0.574    0.618    0.655    0.681    0.708    0.659
0.064       0.779    0.644    0.580    0.541    0.489    0.439    0.359
0.216       1.233    0.938    0.773    0.665    0.532    0.421    0.290
0.512       2.520    1.842    1.449    1.192    0.877    0.626    0.358

Table 8. Multipliers T of σ/β in the ratio {bias of r(1, ∞)}/{standard error of r(1, ∞)}

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20
0.0         -0.356   -0.612   -0.564   -0.342    0.436    2.180    9.072
0.000125    -0.368   -0.759   -0.949   -1.057   -1.169   -1.232   -1.223
0.001       -0.374   -0.838   -1.098   -1.150   -1.458   -1.581   -1.602
0.008       -0.374   -0.944   -1.282   -1.500   -1.709   -1.892   -1.888
0.064       -0.343   -1.106   -1.548   -1.823   -2.112   -2.259   -2.200
0.216       -0.294   -1.364   -1.966   -2.326   -2.692   -2.856   -2.744
0.512       -0.244   -2.204   -3.272   -3.900   -4.516   -4.778   -4.563

Table 9. Values of $Sn^{3/2}$ and $Tn^{1/2}$ when n is very large

            $Sn^{3/2}$              $Tn^{1/2}$
θ           r̃        r(1, k/l)     r̃        r(1, 0)   r(1, 1)   r(1, ∞)
0.000001    159.3    187.2         -5.42    54.96     28.76      2.55
0.000125    90.2     95.1          -4.48    31.65     11.67     -8.30
0.001       66.0     68.1          -3.82    24.92     7.06      -10.79
0.008       47.0     47.4          -2.94    19.68     3.78      -12.13
0.064       35.6     35.7          -1.86    16.99     1.77      -13.46
0.216       36.8     36.8          -1.25    18.50     1.06      -16.38
0.512       55.8     55.8          -0.83    28.32     0.72      -26.91


Table 10. Values of k/l for zero bias in r(1, k/l)

θ           n = 4    n = 5    n = 6    n = 7    n = 9    n = 12   n = 20   n = ∞
0.0         2.50     2.67     4.20     9.14     -10.91   -3.44    -1.80    -1.00
0.000125    2.49     2.11     2.28     2.47     2.80     3.11     3.47     3.81
0.001       2.53     1.92     1.92     1.97     2.06     2.15     2.23     2.31
0.008       2.81     1.79     1.67     1.63     1.62     1.62     1.62     1.62
0.064       4.18     1.88     1.59     1.48     1.40     1.35     1.30     1.31
0.216       7.71     2.16     1.68     1.51     1.36     1.28     1.20     1.13
0.512       19.6     2.54     1.83     1.58     1.37     1.25     1.15     1.05
1.0         ∞        3.00     2.00     1.67     1.40     1.25     1.12     1.00

THE EFFICIENCY OF INTERNAL REGRESSION FOR THE
FITTING OF THE EXPONENTIAL REGRESSION 


By H. O. HARTLEY 
Statistical Laboratory, Iowa State College 


Ten years ago (Hartley, 1948), I suggested a method of estimating the parameters α, β
and p of the exponential equation

$$y = \alpha - \beta p^x \tag{1}$$

by what I called ‘internal regression’. The notation of (1) is that used by Finney (1958)
who compares my estimators with those arising from the direct least-square treatment of
the auto-regression equation

$$y_{i+1} = \alpha(1-p) + p\,y_i, \tag{2}$$

where the $y_i$ are the observed values at n regularly spaced points x = i = 1, 2, ..., n. Finney
attributes the use of this equation to Dr St C. S. Taylor, deals fully only with the case n = 4,
and considers both the model (a) where the errors in (1) are assumed independent and of 
constant variance, and (b) where this assumption is made for the errors in (2). My paper, 
giving examples of the response of crop-yields to independent applications of fertilizers, 
was mainly concerned with (a). In comparing the estimators arising from the direct least- 
square treatment of (2) with my method (which is essentially a least-squares method applied 
to the progressive sums of (2)) Finney states (p. 376): ‘I have been unable to understand 
what particular advantage can be claimed for this procedure on the evidence that Hartley 
supplies. Viewed as a regression calculation, it is considerably more laborious than either 
of those considered above, because of the various partial sums that must be formed. So 
far as its precision is concerned, the construction by way of regression is irrelevant because 
it takes no account of the pattern of the errors.’ 

I do not think that the full facts of the case justify this assessment, for not only do I 
discuss the pattern of errors (called by me residuals, see § 3.1), but I also point out certain
properties of these residuals which are indicative of the relative efficiency of various methods 
of estimation. It was the study of these residuals which prevented me from using the least- 
square estimator arising from (2) under assumption (a). That is, I deliberately decided, on 
the strength of the error residual relationships summarized in my paper, against using the 
very estimator which appears to be advocated in the above passage by Finney. This estimator
has been critically examined by Patterson (1958) and is considered as severely restricted in 
general use by Finney himself (1958, p. 387, line 4 from bottom) after his study of Patterson’s 
results. Moreover (p. 34, lines 6-16), I clearly state the motivation for my estimators—I am 
justifying their use by the evaluation of their large sample variances and efficiencies 
obtained by comparison with the large sample variance formulae of the maximum likelihood 
estimates. I agree, of course, that a good large sample performance of an estimator does 
not necessarily guarantee its efficiency for small samples, but I did no more than follow 
a common statistical practice in advocating estimation procedures on the grounds of 
efficient large sample performance. 

For the special case α = 0 and n large these large sample formulae are given for case (a)
on pages 43-4 of my paper and graphically exhibited in Fig. 3, where it will be seen that the
efficiency of the method tends to 100% as p → 1 (my g = −n ln p → 0) and remains high for
the practical range of p. At the time of publishing my paper similar large sample results for
the estimates of all three parameters in equation (1) were in the course of being worked out
by a student of mine (Miss I. Trivedi) and her results were subsequently recorded in her 
Master’s Thesis (University College London, 1949). These results for large n again show 
the high efficiency of the method for the practical range of p. 

In the present issue there appears a paper by Patterson & Lipton (1959) who kindly 
allowed me to have a preview of their typescript. These authors give a most comprehensive 
evaluation of my method of estimation for both large and small n using the elegant formulae 
previously derived by Patterson (1958). 

It is of interest to compare the limiting efficiencies (for n → ∞) obtained by them with
those computed from Trivedi’s approximate formulae. The results are shown in Table 1 
and appear to be in good agreement. 


Table 1. Comparison of the percentage efficiency of an ‘internal regression’ estimator

              Computed from                Computed from Patterson's
              Trivedi's formula            and Lipton's formula
g = −ln θ     θ            % efficiency    θ            % efficiency
                                           0.064        99
3             0.0498       99.7            -            -
4             0.01832      99.2            -            -
5             0.00674      98.0            0.008        98
6             0.002479     96.2            -            -
7             0.000912     93.8            0.001        94
8             0.000335     90.9            -            -
9             0.000123     87.8            0.000125     88
10            0.000045     84.6            -            -
                                           0.000001     72
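
The pairing of rows in Table 1 rests on the relation g = −ln θ. A minimal sketch of that conversion follows; it is an illustration only, and the variable names are hypothetical.

```python
import math

# theta corresponding to each integer g in the Trivedi columns: theta = exp(-g).
theta_by_g = {g: math.exp(-g) for g in range(3, 11)}

# g corresponding to the theta values used by Patterson and Lipton: g = -ln(theta).
g_by_theta = {
    theta: -math.log(theta)
    for theta in (0.064, 0.008, 0.001, 0.000125, 0.000001)
}
```

On this scale θ = 0.008, 0.001 and 0.000125 correspond to g of about 4.83, 6.91 and 8.99, which is why they are set against the rows g = 5, 7 and 9.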


However, for small values of n, Patterson’s and Lipton’s results on efficiency and bias 
could certainly not have been inferred from our earlier large sample results. Indeed, the 
extremely good performance of my estimator of p for small values of n, which was demonstrated
by their calculations, was a pleasant surprise to me. White (1956) (a student of
O. Kempthorne) had independently obtained some of Patterson’s formulae, but had not 
made a comprehensive numerical evaluation of efficiencies (or bias). 

Patterson and Lipton recommend the use of Stevens’s (1951) 100% efficient least-squares 
estimator where a high speed computer is available and I agree that where computing centres 
are faced with the fitting of this exponential curve sufficiently frequently, the writing of a 
High Speed Computer Program specifically for this computation and for model (a) is 
warranted. 

Powerful methods of analysis, such as those employed in the above paper, permit a 
detailed assessment of the relative merits of estimators. The question now arises, how 
sensitive are the more refined judgements to likely departures of the error pattern from the 
assumed model. For example, when the first ordinate of the exponential curve (measured
from the asymptote) is 20 times as large as the last ordinate or even larger (Patterson's &
Lipton's θ < 0.05), it is doubtful whether the assumption of constant variance will be strictly
satisfied. Since it is only in this range of θ that my estimator fails to be fully efficient, a
re-examination of this drop of efficiency under variance heterogeneity may be desirable. 
Finally, this raises the important question of the ‘robustness’ of the two estimators. I am 
sure we are all agreed that as much information on the physical mechanism (generating the 
responses y) as one can possibly obtain, should be used as a guide to set up an error model.
Thus in independent fertilizer trials one would usually be inclined towards (a), in time series 
phenomena such as growth curves towards (b) or some other stochastic alternative. How- 
ever, often a model (such as the model (a)) is accepted not because of definite evidence as 
to its validity but rather because of lack of evidence for a better alternative. In such situa- 
tions we must be prepared for (at least slight) departures from the accepted model and the 
question of the ‘robustness’ of our optimum estimators becomes of great importance. 
Patterson (1958) seems to have made a start with such studies and reports (p. 399) that he 
has found Stevens’s estimator fairly effective under the model (b). Finney (1958) also dis- 
cusses mixtures of the two models. However, there are clearly other alternatives to the two 
models and much remains to be done on this issue. 
THE DISTRIBUTION OF MOMENT ESTIMATORS


By L. R. SHENTON 
College of Science and Technology, Manchester 


1. In a previous paper (Shenton, 1958) we introduced a class of moment estimator 
depending on the sample moments. We now consider the sampling distribution of a moment 
estimator and give expressions for the first four cumulants, these suggesting that the 
distribution is asymptotically normal. As a special case, when the moment estimators 
depend on an infinity of moments, the cumulants are those of the maximum likelihood 
estimator (assuming one exists) and our expressions agree with those given by Haldane 
& Smith (1956). 

Our main purpose here is to treat the problem for the case of a single parameter in general 
and attempt an approach which will lend itself to the development of the multi-parameter 
estimation problem. Thus although we give an illustrative example we do this to indicate 
what complexity is to be expected in the method and not as a practical illustration, for
most of these involve at least two parameters. The extension to several parameters involving 
simultaneous estimation is deferred, for there are several difficulties of complication to be 
overcome (see Haldane, 1953). 


2.1. Let P(x, θ) be the probability of the variate x depending on the parameter θ. For
the sample and population moments we write 


n n 
m, = dO xj/n, mF = > (x;—mM,)*/n, a (la) 
j=1 j=1 
fs = EmM,, pe = mk. (1b) 
It is assumed that the moments exist and that they are differentiable in some θ-interval.
It is further assumed that the range of x is independent of θ.


2-2. The qth moment estimator 0, (when it exists) satisfies the determinantal equation 


h()=0, ¢=4, (2) 
ron 0 M Mm «... Mg 
O Po My os Ma 
Oy 
h(O) =| 20 Pr Paste Pega} (3) 
au 
ap Ma Mata oe Mg 








This is an alternative form of the expression (5) given by Shenton (1958, p. 411) for the case 
of a single parameter. It may be derived by applying Schweins’s theorem in determinants 
(see, for example, Aitken (1946) or Muir (1906)) to the series expansion (truncated) for a 
moment estimator appearing in (13), p. 112, of my 1950 paper. It will be noted that with the
exception of the first row in (3), h(θ) consists of elements which are functions of θ. It is
convenient to expand h(θ) by its first row in terms of cofactors as follows:

$$h(\theta) = \sum_{s=0}^{q} m_s \mathcal{M}_s(\theta) \tag{4a}$$
$$= \langle m \mathcal{M} \rangle. \tag{4b}$$


For brevity $\mathcal{M}_s$ is written for $\mathcal{M}_s(\theta)$, and in (4b) we have introduced a notation for the vector
inner product of the vectors $m = [m_0, m_1, \dots, m_q]$ and $\mathcal{M} = [\mathcal{M}_0, \mathcal{M}_1, \dots, \mathcal{M}_q]$. It will be
noted that m has q random components, for $m_0 = 1$.


2-3. From (2) we can write the stochastic q-dimensional Taylor expansion for 0, about 
the point yw, and in fact = {exp (A(0/0m))} 4, | m=, A=M> (5) 


where M = m—, A an arbitrary vector. After the evaluation of the inner-product operator 
in (5) we have 


6, = SOM, 0), (6a) 
s=0 
where s! ®, = (M(6/am))*4,, (6b) 


and in (66) the partial derivatives of 0, are to be evaluated at m = ju, this being implied in 
the barred derivative symbol. For example, 


5 004 00, 004 
®, = M, 1a, i am +‘: +4 aD 
where * = 64 , ete. 
om, om, m= 


2-4. Expression for ®,. Differentiating (2) partially with respect to m, we have 








og oh() = om 
Fe, ap TMH 0 (= 12,045 M, = MG) (7) 
, z 06, 
from which M,+h; iin, = 0, (8) 
’ , - _@ % Om 
where from (3) h, = ag?) = (at . (9) 
We thus have 0, = —<MM)Ih,. (10) 


To find an expression for ®,, we require terms such as 624/6m,0m,, which could be found 
from (7) by further differentiation. This piecemeal procedure can be avoided, and indeed 


from (7), for arbitrary A 
ease) ~{AM), (11) 


so that (42 5 im Or. ) Ae =) - -(A ima <AM 


~ 455) Ain 
om 
if A is independent of m. Thus 


Pay od M DD. om 


19 Biom. 46 


so that substituting in (12) leads to 


ang) AZ) § =- one) Case ON -2( 45 apa = (13) 


and on putting ee = M and m = p, and using (8) we have 


210, = FMA + PROX (14) 








in which, after differentiating (3) twice 


; _ &h(g) _ OM ou gel 
at aes )- (4 (ca) re. 


The expressions (10) and (14) give the first two terms of the Taylor expansion for 0,, and 
they involve only terms in the sample moment deviations m, —/,, etc., and the population 
parameter 0. Expressions for ®, and higher terms may be found in a similar way. Thus for 
®, we operate on (13) with <A(@/dm)), replace A by M and m by pw. We thus arrive at the 
following expressions: 








310, = hs— Fh oy yrs ' 2a (at *) 
ii 20 
MM (uy - ; paras =i (162) 
-5¢ ? 00 00? 
13h hgh, — 24h} —hy hj 96h} —19hgh D 
4!@, = oo ey Ma? (MS) 


4 ad Th 33h 
MOH) = eee u< —)- Sth cnay (at? ‘ am 
. ite 
+g a) CM > *E CM MY (ue 5 (i : > (168) 
oe 02M ou 3 /OM OU (a Mee 
2= 3 im 29) 3a * =) 
- ely (160) 
ee -4(oe 6S M OP UL glib Ou (at 
ai O08 5) 0b? sa) -4« \ 30 368 004 
atM 
- (oe 568 al) (16d) 


It is of interest to note that the expressions in (9), (15), (16c) and (16d) follow immediately 
from the identity <M) = 0. 





3.1. Expected values of combinations of the Φ's. We require the expected value of expressions
such as products of powers of ⟨MA⟩, ⟨MB⟩, .... Now for arbitrary A


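$$\mathcal{E}\{\exp \gamma\langle MA\rangle\} = \left[1 + \frac{\gamma^2\psi_2}{2!\,n^2} + \frac{\gamma^3\psi_3}{3!\,n^3} + \dots\right]^n, \tag{17}$$
where $\psi_s = \mathcal{E}\langle XA\rangle^s$, $X_r = x^r - \mu_r$ $(r = 0, 1, \dots, q)$.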











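Thus

$$\left.\begin{aligned}
\mathcal{E}\langle MA\rangle^2 &= \psi_2/n, \\
\mathcal{E}\langle MA\rangle^3 &= \psi_3/n^2, \\
\mathcal{E}\langle MA\rangle^4 &= \psi_4/n^3 + 3(n-1)\psi_2^2/n^3, \\
\mathcal{E}\langle MA\rangle^5 &= \psi_5/n^4 + 10(n-1)\psi_2\psi_3/n^4, \\
\mathcal{E}\langle MA\rangle^6 &= \psi_6/n^5 + (n-1)\{10\psi_3^2 + 15\psi_2\psi_4\}/n^5 + 15(n-1)(n-2)\psi_2^3/n^5.
\end{aligned}\right\} \tag{18}$$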
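
Since ⟨MA⟩ is the mean of n independent zero-mean quantities, the expressions in (18) are the moments of a sample mean and can be checked exactly on a small case. The sketch below is an illustration only: the three-point distribution and the function names are hypothetical choices, with a single zero-mean variate standing in for ⟨XA⟩ and its moments standing in for the ψ's.

```python
from fractions import Fraction
from itertools import product

# Hypothetical three-point distribution with mean zero (chosen only for illustration).
VALUES = [Fraction(-1), Fraction(0), Fraction(2)]
PROBS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]


def psi(r):
    """r-th moment of a single zero-mean observation."""
    return sum(p * v ** r for v, p in zip(VALUES, PROBS))


def mean_moment(n, k):
    """Exact expectation of (sample mean)^k for n independent observations, by enumeration."""
    total = Fraction(0)
    for idx in product(range(len(VALUES)), repeat=n):
        prob = Fraction(1)
        s = Fraction(0)
        for i in idx:
            prob *= PROBS[i]
            s += VALUES[i]
        total += prob * (s / n) ** k
    return total


def formula(n, k):
    """Right-hand sides of the five expressions in (18)."""
    p2, p3, p4, p5, p6 = (psi(r) for r in range(2, 7))
    n = Fraction(n)
    return {
        2: p2 / n,
        3: p3 / n ** 2,
        4: p4 / n ** 3 + 3 * (n - 1) * p2 ** 2 / n ** 3,
        5: p5 / n ** 4 + 10 * (n - 1) * p2 * p3 / n ** 4,
        6: p6 / n ** 5
        + (n - 1) * (10 * p3 ** 2 + 15 * p2 * p4) / n ** 5
        + 15 * (n - 1) * (n - 2) * p2 ** 3 / n ** 5,
    }[k]


# Compare enumeration with the formulae for n = 4 and powers 2 to 6.
checks = {k: mean_moment(4, k) == formula(4, k) for k in range(2, 7)}
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a numerical approximation.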





Moreover, it is readily seen that 


Wi, = E{ZA,(x" — p,)}° 
= (Ay (1) caw <apy + (5) Aare? any? 


+(—1yee(,*) amrCart+(-De-1)<4my, (19) 


in which (Aw)’ is a symbolic multinomial expression to be interpreted by replacing a term 
such as pip" by /4,2,- For example, when q = 2, 


(Ap)? = A2+ Atuy+ ABM, + 2A, Aofs + 2Ag Agfa + 2A9 Aj fy. 
3:2. Derived expressions. To evaluate an expression such as 
6&{(M A) (MB) <MC) 
we merely express this as 6 (o “a> (B i (M A>’, 
0A 0A 


A, B, C being arbitrary vectors independent of m and 0. Thus from 





n&(MAY* = (Ap)*— (Ap? (20a) 
we derive né{M A) <M BY = <Ap) (Bp) — (Ap) (Bp). (20D) 
For example n&{(MM)<BM) =< Mp) (Bp), (21a) 
OM 0 
n&( MM) a) = (ay A < »)=- ay (216) 
n&(MMY? = - ACM 5) a. (21¢) 
where A = |pXo, fa, --+> Hag 


Similarly for third-order terms we find 


n®&{M A) <M B) (MC) = (Ap) <Bu) (Cp) — (Ap) (Bu) (Cp) — (Ap) Be) (Ce) 
— (Ap) (Bp) (Cpe) + XA) Bi) Cp), (22) 


eee w)- (Mp)? = oH): (23a) 


Sn) cen Shed) 


26MM My = (Mp) 
Again, without going into details, it will be found that


n3&(M M)* = 3(n—1) (Ahy)?+<¢ Mp4, (24a) 
0M 0 OM 
né(M My ¢ © oH) = —3(n—1)A%, (s +My »)- Mp es hn). 
(246) 


4.1. Moments of $\theta_q$. To work out the third and fourth standardized cumulants $\gamma_1 = \kappa_3/\kappa_2^{3/2}$,
$\gamma_2 = \kappa_4/\kappa_2^2$ to order $n^{-1/2}$ and $n^{-1}$, respectively, we require the expectations of the following


terms: - 
Linear ®,; 


Quadratic 02, 0,0, 3, 0,4; 
Cubic ©}, O79,; 
Quartic Of}, D30,, O70}, OF0,. 


Omitting the algebra, we quote some of the results in Table 1. 





Table 1 
Symbol Expected value 
9 + onhe ; e oh) +00 ”) 
Yj =" fag 
ze 
. on 3A2 /0.M du 
we, ~ onthe he 20 "3 
- 3(n—1) A? (.Mp)* 
Ds nh? ht 





oo, _ (apy a [5d OM op 
2 nhs ah 00 20 
ai\ du)? OM Any? O(n-4 
al (Mp) 00 5 7-H to situa 
2 2 A* {15 
PD; D3 nips (" («4 aa) + + ey x +O(n~* 
OM ou 
(1 this table ‘¥ = h, ex “)- -<s ») . ae 20 


4:2. The moments 
HE (9,) _ é(O, —60,)8 


are now evaluated and it appears that 


+ Cid OM om =u 
0.) = 2 2 
13(0,) = -+ ag (20 +40 (as sgt) + +3A (s Kea a) 
~- fomou =| 
+ h, A? (a > + 3A°h, 30 om + h<Mp>*® 


— 2h, Mp)? e ») + 2am} +0O(n-%), 
















(Ap) | 3A® a / 
ux(0,) = — ae +R Pp 5 ii +O(n~), (25) 
_ 3(n—1) A? CMp)* Aap)? 4 Ou C4 of 
a 4(9,) = nshi + nhs nshs 18¢ & mt +e 00 00 








45A8 ( o*u\? 430A? OM om ( MeN 3042 
n3h> 06? + ene 00 5) mt Bhs 


“ i Ka p>? ( ) — (My = "| 
+ aE? ee pyr <3 Hoh] +0 +O(n-4). (25c) 


4:3. We now find as first approximations to the standardized cumulants 


Bar Ms . va ~ (My? 








0 : 26 
V1( a) ~ /(nA8h3) ( a) 
(Mp>* 12.ap> 8 Ou OM Om 12¥ 
wld) ~ “aR aR a0) +70 20)|* TE 
+= s -_ OM om (ae wet 
a) ty 00 yo ee MX) 
124 aM eu 12 . ‘ 
ee AH ay p) ies (266) 


with a clear indication of asymptotic normality. If $P(x, \theta)$ is linear in $\theta$ (a case which would
arise for a Gram-Charlier distribution, the parameter governing the corrective term) there is
a considerable simplification in the moments, and indeed











1y(9,) = 9+ O(n), (27a) 
A 0.M Om 
u8(6,) ~ 5+ marie AY — 2 gee ah, Mp)? u)+ 2am}, (276) 
~My 

Y1(9q) ~ pe (27c) 

OM OM OM 

é 3 ee 9 YF pent 
nY2(9q) (ayy = ee = 4 (27d) 

Tae ~ AR nV ik VE . 3 


It is of some interest to observe that the first-order term in $\mathscr{E}(\theta_q)$ vanishes when the
probability is linear in $\theta$; according to Haldane & Smith (1956) a similar property holds for
maximum likelihood estimators in the case of linearity. We should expect this to be the
case since (27a) holds for moment estimators of all orders (i.e. $q = 1, 2, \ldots$) under certain
regularity conditions. Haldane has suggested that estimators with this property should be
described as being almost unbiased.


5. Maximum likelihood estimators. If there is a moment estimator which gives the
maximum likelihood solution (this may be the case for finite $q$, and in any event for $q \to \infty$
under certain limitations), then the expressions in (25)-(27) can be given in terms of the
expected values of derivatives of $\log P(x, \theta)$. Thus (using = to indicate correspondence)


(ast [a= tmx): (28a) 

(4%) Iu oFia6 Bee “a 

COM ae 

Ch Oee e led e 
e a) * -35 5 cp ae + alge} om - 96 pa (See) : = 





» [o \ . ¢{_n20A (AlogP)* _ ,4(2logP\*_ A® aP\? &P 
(4p) e w)=é{-a 5 (— 00) *™\-30-) ~F\a0) ae 


The derivation of most of these is straightforward, but for (28) we note that 





OlogP . 
where (MX) = My t+M,x+M@_x*+..., 


and (29) is derived by differentiating (27a), and (30) from (27b). For the sampling moments
of the maximum likelihood estimator we therefore find (after some non-trivial algebra)
1 aPeP 
" \P? 00 06? 


&(8) = 0- ‘(evan , (n-*), 


a | sae! aP\22P 1 0POP\ |... 
0X0) = sar +6{( 00 Pi\z0) a08~ P8230 08| | 


GleRaiay CR Iercow om 


(33a) 




















y,(0) = [e(“= )- ~36( Faq age) / Vind) + O(n), (33c) 
wrt) = (MEE) free IZA ane) (2822 
~46(Fa59 aga)| [8-3 +00 (33d) 
where I= a(° = =) 


These expressions agree exactly (allowing for notational differences) with those given in (6) of
Haldane & Smith (1956, p. 100). It is to be remarked that the forms given are not necessarily
the simplest, and that there are alternatives for such terms as $\mathscr{E}\{(1/P^2)(\partial P/\partial\theta)(\partial^2 P/\partial\theta^2)\}$
in terms of the expected value of powers of $(\partial \log P/\partial\theta)$ or their derivatives (in this connexion
reference may be made to Bartlett (1953) who gives an ingenious method of deriving such
identities).

6. Illustrative example. In Shenton (1958) the asymptotic variances are given for the first
three moment estimators of $\theta$ in

\[ P_x = e^{-x}\{2 - \theta + (\theta - 1)x\} \quad (0 < x < \infty,\ 1 < \theta < 2). \]

Using the results in (27) and (33) we find for $\gamma_1$ and $\gamma_2$ for the first three moment estimators
and the maximum likelihood estimator the following:


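\[ \sqrt{n}\,\gamma_1(\theta_1) \sim (2\theta^3 - 12\theta^2 + 24\theta - 12)/(-\theta^2 + 4\theta - 2)^{3/2}, \]
\[ n\gamma_2(\theta_1) \sim (-6\theta^4 + 48\theta^3 - 144\theta^2 + 192\theta - 84)/(-\theta^2 + 4\theta - 2)^2; \tag{34a} \]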
\ny1(92) ~ (12808 — 86405 + 218404 — 260668 + 147662 — 3000 — 16) /(«f)i, 
nyY(Op) ~ (— 614469 + 59,90408 — 24345607 + 568,8960% — 901,94405 + 1,064,70604; (340) 
— 934,51263 + 558, 1446? — 194,9280 + 29,340)/(a3f?), 
where a=40-3, P=—40+ 1567-12042; 
(ny3(93) ~ pl(qryt, } 
ny2(93) ~ p’/(q*r*), 
where p = 675009 —51,30068 + 180,50407 — 411,2800° + 689,47205 
— 854,40064 + 738,6880% — 411,8406? + 131,7120 — 18,304, 


(34c) 


q = 150?—200+6, r= —1504+5663— 486? + 8, 
p’ = —4,556,2500"4 + 52,245,0000!3 — 284,318, 100012 + 1,473,232,3200" 
— 7,531,774,38002° + 28,692,814,22469 — 75,171,819,96068 + 137,243,452,80007 
— 178,312,654,84868 + 166,526,475,26495 — 111,281,505,02404 + 52,015,042,5600% 
— 16,167,313,1526? + 3,002,741,7600 — 252,062,208; 


Vnys(B) ~ (a +1) {(a+4) I — (+ 1)?/a}/T?, 
~ —(alog(1/a))-? (@ = 2approx.), (34d) 
ny (0) ~ —3+(a+1)2{(a+3)(a#+7)I—(a4+ 1)? (a2 + 6e— uN, 


~ (2a?log(1/x))-* (@ = 2approx.), 


2-6 ; ; «du 
where a=-—, I=—(a+1)?—(a+1)%e*Hi(-—a), Hi(—2) -| eu—., 

0-1 ~ U 
It will be fairly evident from (34) that an investigation of the sampling distributions of 
moment estimators such as @, (or higher orders) would be a considerable undertaking. An 
impression of the situation for this example is given in Figs. 1-3.* The gain in using
higher-order estimators is evident from a glance at the variances in Fig. 1. As for $\gamma_1$ and $\gamma_2$, the
approach of the second and third estimators to the maximum likelihood estimator is
illustrated in Figs. 2 and 3. The critical value of the parameter is $\theta = 2$, for in this case the
integrals appearing in the moments of the maximum likelihood estimator diverge.
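The first pair of expressions in (34) can be reproduced directly from the moments of $P_x$. The following is a sketch under stated assumptions: since $\mathscr{E}(x) = \theta$ for this density, the first moment estimator is taken to be the sample mean, so that $\sqrt{n}\,\gamma_1 = \kappa_3/\kappa_2^{3/2}$ and $n\gamma_2 = \kappa_4/\kappa_2^2$ with $\kappa_r$ the cumulants of a single observation. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def raw_moment(theta, r):
    """E[x^r] under P_x = e^{-x}{2 - theta + (theta - 1)x}, using int x^r e^{-x} dx = r!."""
    return factorial(r) * (2 - theta) + factorial(r + 1) * (theta - 1)


def cumulants_of_x(theta):
    """Second, third and fourth cumulants of one observation."""
    m1, m2, m3, m4 = (raw_moment(theta, r) for r in range(1, 5))
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    k4 = mu4 - 3 * k2 ** 2
    return k2, k3, k4


def polynomials_34a(theta):
    """Variance and the two numerators as they appear in the first pair of (34)."""
    variance = -theta ** 2 + 4 * theta - 2
    numerator_1 = 2 * theta ** 3 - 12 * theta ** 2 + 24 * theta - 12
    numerator_2 = (-6 * theta ** 4 + 48 * theta ** 3 - 144 * theta ** 2
                   + 192 * theta - 84)
    return variance, numerator_1, numerator_2


theta = Fraction(3, 2)
k2, k3, k4 = cumulants_of_x(theta)
print(k2, k3, k4)
print(float(k3) / float(k2) ** 1.5, float(k4 / k2 ** 2))
```

The two printed ratios are the limiting values of $\sqrt{n}\,\gamma_1$ and $n\gamma_2$ at the chosen $\theta$.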

7. Conclusion. Expressions have been given for the first two standardized cumulants
$\gamma_1$ and $\gamma_2$ of the sampling distribution of the $q$th moment estimator. As a special case the
maximum likelihood estimator is included, the results found agreeing with those of Haldane
& Smith (1956).


* I am indebted to Mr A. Fletcher for assisting in the construction of these diagrams.

Fig. 1. Asymptotic variance of estimators of $\theta$ in $P_x = e^{-x}\{2 - \theta + (\theta - 1)x\}$.


EFFICIENT ESTIMATION OF PARAMETERS IN MOVING-AVERAGE MODELS


By J. DURBIN 


Research Techniques Division, London School of Economics 


1. INTRODUCTION 


Although the moving-average process is one of the basic models of time-series analysis, 
efforts at efficient estimation of its parameters have not been very successful owing to the 
intractability of the maximum likelihood equations. In this paper a simple estimation
procedure is suggested which in large samples gives estimates whose efficiency is as close 
to unity as is desired. The limiting variance matrix of the estimates is evaluated explicitly in 
terms of the coefficients of the model. The idea underlying the method is to fit a high-order 
autoregressive scheme and to base the estimates of the moving-average parameters on the 
fitted autoregression coefficients. Reference may be made to Wold (1938) and Whittle 
(1951, 1953) for earlier work on the problem. 


2. MAXIMUM LIKELIHOOD APPLIED TO THE FIRST-ORDER MODEL 
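Suppose we have a stationary first-order moving-average model
\[ x_t = \epsilon_t + \beta\epsilon_{t-1} \quad (t = 1, \ldots, n), \tag{1} \]
where $\{\epsilon_t\}$ is a series of independent normal variates with zero mean and variance $\sigma^2$, and
where $|\beta| < 1$. The variance matrix of $x_1, \ldots, x_n$ is $\sigma^2 V_n$, where
\[
V_n = \begin{bmatrix}
1+\beta^2 & \beta & 0 & \cdots & 0 \\
\beta & 1+\beta^2 & \beta & & \vdots \\
0 & \beta & 1+\beta^2 & \ddots & 0 \\
\vdots & & \ddots & \ddots & \beta \\
0 & \cdots & 0 & \beta & 1+\beta^2
\end{bmatrix}.
\]
Using a result due to Dixon (1944) the determinant of this is $|V_n| = (1-\beta^{2n+2})/(1-\beta^2)$,
which tends to $1/(1-\beta^2)$ for large $n$. The inverse of $V_n$ is approximately
\[
\frac{1}{1-\beta^2}
\begin{bmatrix}
1 & -\beta & \beta^2 & \cdots & (-\beta)^{n-1} \\
-\beta & 1 & -\beta & & \vdots \\
\beta^2 & -\beta & 1 & \ddots & \\
\vdots & & \ddots & \ddots & -\beta \\
(-\beta)^{n-1} & \cdots & & -\beta & 1
\end{bmatrix}.
\]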
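Both statements about $V_n$ are easy to check numerically. The sketch below is illustrative only (function names are hypothetical): it evaluates $|V_n|$ by the three-term recurrence for a tridiagonal determinant, compares it with the closed form, and confirms that the approximate inverse reproduces the identity on interior rows, the approximation being confined to the first and last rows.

```python
def det_ma1(beta, n):
    """|V_n| from the recurrence D_m = (1 + beta^2) D_{m-1} - beta^2 D_{m-2}, D_0 = 1."""
    d_prev, d = 1.0, 1.0 + beta * beta
    for _ in range(2, n + 1):
        d_prev, d = d, (1.0 + beta * beta) * d - beta * beta * d_prev
    return d


def det_closed_form(beta, n):
    """The closed form (1 - beta^(2n+2)) / (1 - beta^2)."""
    return (1.0 - beta ** (2 * n + 2)) / (1.0 - beta * beta)


def interior_product(beta, i, j):
    """(V_n W)_{ij} for an interior row i, where W_{ij} = (-beta)^|i-j| / (1 - beta^2)."""
    def w(r, s):
        return (-beta) ** abs(r - s) / (1.0 - beta * beta)
    return beta * w(i - 1, j) + (1.0 + beta * beta) * w(i, j) + beta * w(i + 1, j)


beta = 0.5
for n in (1, 2, 5, 20):
    print(n, det_ma1(beta, n), det_closed_form(beta, n))
print(interior_product(beta, 5, 5), interior_product(beta, 5, 8))
```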
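Thus the likelihood is approximately given by
\[
\log L = \text{constant} - \tfrac12\log(1-\beta^2) - \frac{1}{2\sigma^2(1-\beta^2)}\Big[\sum x_t^2 - 2\beta\sum x_t x_{t+1} + 2\beta^2\sum x_t x_{t+2} - \cdots\Big].
\]
Since the term $\tfrac12\log(1-\beta^2)$ is of small order in $n$ compared with the remaining part of
$\log L$ we may neglect it to give the approximate maximum likelihood equation
\[
\frac{\partial}{\partial\beta}\Big[\frac{1}{1-\beta^2}\Big\{\sum x_t^2 - 2\beta\sum x_t x_{t+1} + 2\beta^2\sum x_t x_{t+2} - \cdots\Big\}\Big] = 0 \tag{2}
\]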










(cf. Whittle, 1951, equation 7-523). On attempting to perform the differentiation, however, 
we find ourselves with an unmanageable estimating equation. 

On the other hand, the simple estimate obtained by equating the theoretical value of
the first autocorrelation coefficient $\rho_1 = \beta/(1+\beta^2)$ to its empirical counterpart

\[ r_1 = n\sum x_t x_{t+1}\Big/\Big\{(n-1)\sum x_t^2\Big\} \]

is known to be extremely inefficient. For example, for $\beta = \tfrac12$ Whittle (1953) calculated its
asymptotic variance to be 3.8 times that of the maximum likelihood estimate. Whittle
accordingly suggests an adjustment procedure intended to bring the value closer to the
maximum likelihood value but this appears to be rather complicated.

In the next section a simple but efficient alternative method is suggested. 




3. ESTIMATION BASED ON THE AUTOREGRESSIVE REPRESENTATION 


It is well known (see, for example, Wold, 1938) that the model (1) has the infinite
autoregressive representation
\[ x_t + \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \cdots = \epsilon_t, \tag{3} \]
where $\alpha_i = (-\beta)^i$. The remainder after $k+1$ terms of the series $x_t + \alpha_1 x_{t-1} + \cdots$ is
\[ (-\beta)^{k+1}(x_{t-k-1} - \beta x_{t-k-2} + \cdots) = (-\beta)^{k+1}\epsilon_{t-k-1}, \]
which has variance $\beta^{2k+2}\sigma^2$. This tends to 0 rapidly as $k \to \infty$ since $|\beta| < 1$. Consequently the finite
representation
\[ x_t + \alpha_1 x_{t-1} + \cdots + \alpha_k x_{t-k} = \epsilon_t \tag{4} \]
can be made as accurate as we please by taking $k$ sufficiently large. However, it is important
to stress that although $k$ is taken to be large, we shall always in asymptotic arguments regard
it as small compared with $n$.
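The size of the truncation error in (4) can be seen by multiplying the moving-average operator $1 + \beta z$ by the truncated autoregressive operator $\sum_{i=0}^{k}(-\beta)^i z^i$: every coefficient cancels except the constant term and the one at lag $k+1$. The sketch below is illustrative only; the helper names and the choice $\beta = 0.5$, $k = 5$ are assumptions.

```python
def ar_coefficients(beta, k):
    """alpha_0, ..., alpha_k with alpha_i = (-beta)^i (alpha_0 = 1)."""
    return [(-beta) ** i for i in range(k + 1)]


def poly_mul(p, q):
    """Coefficients of the product of two polynomials in z (lowest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


beta, k = 0.5, 5
# Applying the truncated AR filter to x_t = (1 + beta z) e_t leaves
# e_t - (-beta)^(k+1) e_{t-k-1}: only the first and last coefficients survive.
product_coeffs = poly_mul([1.0, beta], ar_coefficients(beta, k))
remainder_variance = beta ** (2 * k + 2)  # in units of sigma^2
print(product_coeffs)
print(remainder_variance)
```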

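Let $a_1, \ldots, a_k$ be the least-squares estimators of $\alpha_1, \ldots, \alpha_k$, i.e. the estimators obtained by
minimizing $\sum_{t=k+1}^{n}(x_t + a_1 x_{t-1} + \cdots + a_k x_{t-k})^2$. From the results of Mann & Wald (1943) we
know that $a_1, \ldots, a_k$ are asymptotically normal with means $\alpha_1, \ldots, \alpha_k$ and variance matrix
$V_k^{-1}/n$, where $\sigma^2 V_k$ is the variance matrix of $x_{t-1}, \ldots, x_{t-k}$. Consequently, $a_1, \ldots, a_k$ have the
asymptotic distribution
\[
dP = \frac{n^{k/2}|V_k|^{1/2}}{(2\pi)^{k/2}}
\exp\Big[-\frac{n}{2}\Big\{(1+\beta^2)\sum_{i=1}^{k}(a_i-\alpha_i)^2 + 2\beta\sum_{i=1}^{k-1}(a_i-\alpha_i)(a_{i+1}-\alpha_{i+1})\Big\}\Big]\,da_1 \ldots da_k. \tag{5}
\]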


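Since $\alpha_1, \ldots, \alpha_k$ are autoregression coefficients they satisfy the relations
\[
\begin{aligned}
\alpha_1 c_0 + \alpha_2 c_1 + \cdots + \alpha_k c_{k-1} &= -c_1,\\
\alpha_1 c_1 + \alpha_2 c_0 + \cdots + \alpha_k c_{k-2} &= -c_2,\\
&\;\;\vdots\\
\alpha_1 c_{k-1} + \cdots + \alpha_k c_0 &= -c_k,
\end{aligned}
\]
where $\sigma^2 c_r = E(x_t x_{t+r})$. Putting $c_0 = 1+\beta^2$, $c_1 = \beta$ and $c_r = 0$ $(r > 1)$, we obtain
\[
\begin{aligned}
(1+\beta^2)\alpha_1 + \beta\alpha_2 &= -\beta,\\
\beta\alpha_{r-1} + (1+\beta^2)\alpha_r + \beta\alpha_{r+1} &= 0 \quad (r = 2, \ldots, k-1),\\
\beta\alpha_{k-1} + (1+\beta^2)\alpha_k &= 0.
\end{aligned}
\]
Multiplying these equations by $-2a_i + \alpha_i$ $(i = 1, \ldots, k)$ in turn and adding, we get for the
quadratic expression, $Q$ say, in the exponent of (5),
\[
\begin{aligned}
Q &= (1+\beta^2)\sum_{i=1}^{k}(a_i-\alpha_i)^2 + 2\beta\sum_{i=1}^{k-1}(a_i-\alpha_i)(a_{i+1}-\alpha_{i+1})\\
  &= (1+\beta^2)\sum_{i=1}^{k}a_i^2 + 2\beta\sum_{i=1}^{k-1}a_i a_{i+1} + 2\beta a_1 - \beta\alpha_1.
\end{aligned}
\]
Since, for large $k$, $\alpha_1$ is nearly equal to $-\beta$ this gives, on putting $a_0 = 1$,
\[ Q = (1+\beta^2)\sum_{i=0}^{k}a_i^2 + 2\beta\sum_{i=0}^{k-1}a_i a_{i+1} - 1, \tag{6} \]
to a high degree of accuracy.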

We proceed to estimate $\beta$ by maximizing the likelihood obtained from the distribution
of $a_1, \ldots, a_k$. As in § 2, $|V_k| = (1-\beta^{2k+2})/(1-\beta^2)$ which for sufficiently large $k$ is approximately
equal to $1/(1-\beta^2)$. This, however, is $O(1)$ whereas the exponent of (5) is $O(n)$. Consequently,
to a first approximation maximizing the likelihood is equivalent to minimizing the quadratic
form $Q$. Differentiating $Q$ with respect to $\beta$ and equating to zero we have for the estimator
of $\beta$,
\[ b = -\sum_{i=0}^{k-1}a_i a_{i+1}\Big/\sum_{i=0}^{k}a_i^2. \tag{7} \]
This estimator is manifestly much simpler to work with than the maximum likelihood
estimator obtained from (2). As $n$ increases it converges in probability to
$-\sum_{i=0}^{k-1}\alpha_i\alpha_{i+1}\big/\sum_{i=0}^{k}\alpha_i^2$,
where $\alpha_0 = 1$, and this may be made as close to $\beta$ as we please by taking $k$ sufficiently large.
Since there is one term more in the denominator of (7) than in the numerator there is, perhaps,
something to be said for dropping one of these terms, i.e. taking
$b^* = -\sum_{i=0}^{k-1}a_i a_{i+1}\big/\sum_{i=0}^{k-1}a_i^2$
as the estimator. This possibility will not, however, be pursued here. A further refinement
would be to take account of the determinant $|V_k|$ in (5). This would give the estimating
equation
\[ \frac{\beta}{n(1-\beta^2)} - \Big(\beta\sum a_i^2 + \sum a_i a_{i+1}\Big) = 0, \]
which can be solved iteratively. However, the effect of the extra term is generally negligible
unless $\beta^2$ is close to one.
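As an illustration of (7), the estimator can be evaluated at the limiting coefficients $\alpha_i = (-\beta)^i$ to see how close its probability limit is to $\beta$ for a given $k$. This is only a sketch: the function names are hypothetical, and the coefficients of the truncated infinite representation are used in place of the exact order-$k$ autoregression coefficients.

```python
def durbin_b(a):
    """Estimator (7): a = [a_1, ..., a_k]; a_0 = 1 is prepended."""
    c = [1.0] + list(a)
    numerator = sum(c[i] * c[i + 1] for i in range(len(c) - 1))
    denominator = sum(v * v for v in c)
    return -numerator / denominator


def durbin_b_star(a):
    """Variant b*: the last term is dropped from the denominator."""
    c = [1.0] + list(a)
    numerator = sum(c[i] * c[i + 1] for i in range(len(c) - 1))
    denominator = sum(v * v for v in c[:-1])
    return -numerator / denominator


beta, k = 0.5, 5
alphas = [(-beta) ** i for i in range(1, k + 1)]
# With these coefficients b equals beta (1 - beta^(2k)) / (1 - beta^(2k+2)),
# while b* reproduces beta itself.
print(durbin_b(alphas), durbin_b_star(alphas))
```

For $\beta = 0.5$ and $k = 5$ the difference between the limit of $b$ and $\beta$ is already below $10^{-3}$.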





4. EFFICIENCY OF THE ESTIMATOR 


The minimum asymptotic variance of consistent estimators of $\beta$ has been evaluated by
Whittle (1953) as $(1-\beta^2)/n$. Whittle also shows that this minimum is attained by the
maximum likelihood estimator derived from (2). We now demonstrate that the asymptotic
variance of $b$ can be made as close to the minimum as we please by taking $k$ sufficiently large.
This will be done by an argument similar to that usually employed for demonstrating the
efficiency of maximum likelihood estimators (see, e.g. Cramér, 1946, § 33.3).
For large $k$ we may take the asymptotic distribution of $a_1, \ldots, a_k$ to be
\[ dP = \frac{n^{k/2}(1-\beta^2)^{-1/2}}{(2\pi)^{k/2}}\exp(-\tfrac12 nQ)\,da_1 \ldots da_k, \tag{8} \]
where $Q$ is given by (6). Let us write this in the form
\[ dP = (1-\beta^2)^{-1/2}f(Q)\,da \]
and let us write the integral of this with respect to $a_1, \ldots, a_k$ as
\[ \int (1-\beta^2)^{-1/2}f(Q)\,da = 1, \quad \text{i.e.} \quad \int f(Q)\,da = (1-\beta^2)^{1/2}. \tag{9} \]
Differentiating (9) and dividing through by $n$ we have
\[ \int \frac{\partial Q}{\partial\beta}f(Q)\,da = O\Big(\frac1n\Big), \quad \text{i.e.} \quad E\Big(\frac{\partial Q}{\partial\beta}\Big) = O\Big(\frac1n\Big), \tag{10} \]
and differentiating again,
\[ \int\Big\{\frac{\partial^2 Q}{\partial\beta^2} - \frac{n}{2}\Big(\frac{\partial Q}{\partial\beta}\Big)^2\Big\}f(Q)\,da = O\Big(\frac1n\Big), \quad \text{i.e.} \quad E\Big(\frac{\partial^2 Q}{\partial\beta^2}\Big) - \frac{n}{2}E\Big(\frac{\partial Q}{\partial\beta}\Big)^2 = O\Big(\frac1n\Big). \tag{11} \]
Now
\[ \frac{\partial Q}{\partial\beta} = 2\beta\sum_{i=0}^{k}a_i^2 + 2\sum_{i=0}^{k-1}a_i a_{i+1} \quad \text{and} \quad \frac{\partial^2 Q}{\partial\beta^2} = 2\sum_{i=0}^{k}a_i^2. \]
Using these results together with (7), we find
\[ b - \beta = -\frac{\partial Q/\partial\beta}{\partial^2 Q/\partial\beta^2}, \]
so that
\[ V(b) = \frac{E(\partial Q/\partial\beta)^2}{E(\partial^2 Q/\partial\beta^2)^2} \]
to the first order in $n$. Observing that $E(\partial^2 Q/\partial\beta^2)^2 = [E(\partial^2 Q/\partial\beta^2)]^2$ to the first order, and
using (11) we have the asymptotic result
\[ V(b) = \frac{2}{nE(\partial^2 Q/\partial\beta^2)}. \]
To the first order in $n$, $E(\partial^2 Q/\partial\beta^2) = 2\sum_{i=0}^{k}\alpha_i^2$, which for large $k$ tends to $2\sum_{i=0}^{\infty}(-\beta)^{2i} = 2/(1-\beta^2)$.
Thus for sufficiently large $k$ the asymptotic variance of $b$ is
\[ V(b) = \frac{1-\beta^2}{n} \tag{12} \]
as closely as we please.


5. HIGHER-ORDER PROCESSES


The extension to higher-order processes follows along similar lines. Suppose the model is
\[ x_t = \epsilon_t + \beta_1\epsilon_{t-1} + \cdots + \beta_h\epsilon_{t-h}, \tag{13} \]
where the $\epsilon$'s are as before and where the roots of the equation $x^h + \beta_1 x^{h-1} + \cdots + \beta_h = 0$ each
have modulus less than one. (13) can be approximated to any required degree of accuracy
by the finite autoregressive process
\[ x_t + \alpha_1 x_{t-1} + \cdots + \alpha_k x_{t-k} = \epsilon_t. \]
As before the asymptotic distribution of the least-squares estimates $a_1, \ldots, a_k$ of $\alpha_1, \ldots, \alpha_k$
is, approximately for large $k$,
\[ dP = \frac{n^{k/2}|B|^{1/2}}{(2\pi)^{k/2}}\exp(-\tfrac12 nQ)\,da_1 \ldots da_k, \]
where, writing $a$, $\alpha$ for the vectors $\{a_1, \ldots, a_k\}$, $\{\alpha_1, \ldots, \alpha_k\}$ and $\sigma^2 B$ for the variance matrix
of $x_{t-1}, \ldots, x_{t-k}$,
\[
\begin{aligned}
Q &= (a-\alpha)'B(a-\alpha)\\
  &= a'Ba - 2a'B\alpha + \alpha'B\alpha.
\end{aligned}
\]
Since $\alpha$ is a vector of autoregression coefficients it satisfies the equation $B\alpha + c = 0$, where
$c = \{c_1, \ldots, c_k\}$ and $\sigma^2 c_i = E(x_t x_{t+i})$. Thus
\[ Q = a'Ba + 2a'c - \alpha'c, \]
where $\alpha'c + 1 + \sum_{i=1}^{h}\beta_i^2$ is the constant term in the expansion of
\[ (1 + \alpha_1 z + \cdots + \alpha_k z^k)\sum_{i=-h}^{h}c_i z^i. \]
Now
\[ \sum_{i=-h}^{h}c_i z^i = (1 + \beta_1 z + \cdots + \beta_h z^h)(1 + \beta_1 z^{-1} + \cdots + \beta_h z^{-h}) \]
and
\[ 1 + \alpha_1 z + \cdots + \alpha_k z^k = (1 + \beta_1 z + \cdots + \beta_h z^h)^{-1} \]
to any required degree of accuracy by taking $k$ sufficiently large. Thus $\alpha'c + 1 + \sum_{i=1}^{h}\beta_i^2 = 1$,
so that
\[ Q = a'Ba + 2a'c + \sum_{i=1}^{h}\beta_i^2. \tag{14} \]
i=1 


As for the first-order process $\log|B|$ is found to be of small order in $n$ compared with $\tfrac12 nQ$
and may therefore be neglected in deriving estimating equations for the $\beta$'s. For completeness,
however, the evaluation of $|B|$ was investigated for a second-order process and in the
Appendix it is shown to have the limiting value $1/[(1-\beta_2)^2\{(1+\beta_2)^2-\beta_1^2\}]$ for large $k$.
Neglecting $|B|$, on differentiating $Q$ with respect to $\beta_1, \ldots, \beta_h$ and equating the derivatives
to zero we obtain the estimators $b_1, \ldots, b_h$ of $\beta_1, \ldots, \beta_h$ as the solution of the linear equations
\[
\begin{bmatrix}
\sum_{i=0}^{k}a_i^2 & \sum_{i=0}^{k-1}a_i a_{i+1} & \cdots & \sum_{i=0}^{k-h+1}a_i a_{i+h-1}\\
\sum_{i=0}^{k-1}a_i a_{i+1} & \sum_{i=0}^{k}a_i^2 & \cdots & \sum_{i=0}^{k-h+2}a_i a_{i+h-2}\\
\vdots & & \ddots & \vdots\\
\sum_{i=0}^{k-h+1}a_i a_{i+h-1} & \cdots & & \sum_{i=0}^{k}a_i^2
\end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_h \end{bmatrix}
= -\begin{bmatrix}
\sum_{i=0}^{k-1}a_i a_{i+1}\\
\sum_{i=0}^{k-2}a_i a_{i+2}\\
\vdots\\
\sum_{i=0}^{k-h}a_i a_{i+h}
\end{bmatrix}. \tag{15}
\]
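The equations (15) have a Toeplitz matrix of lagged sums of products of the $a_i$ (with $a_0 = 1$), so they can be set up and solved in a few lines. The sketch below is an illustration rather than the author's computing procedure: the function names are hypothetical, a small Gaussian elimination stands in for any linear solver, and the check uses the coefficients obtained by formally inverting an assumed second-order moving-average operator with $\beta_1 = 0.5$, $\beta_2 = 0.3$.

```python
def lag_sums(a, h):
    """A_r = sum_i a_i a_{i+r}, r = 0..h, with a_0 = 1 prepended to a = [a_1..a_k]."""
    c = [1.0] + list(a)
    return [sum(c[i] * c[i + r] for i in range(len(c) - r)) for r in range(h + 1)]


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def durbin_b_vector(a, h):
    """Solve the linear equations (15) for b_1, ..., b_h."""
    big_a = lag_sums(a, h)
    matrix = [[big_a[abs(r - s)] for s in range(h)] for r in range(h)]
    rhs = [-big_a[r + 1] for r in range(h)]
    return solve_linear(matrix, rhs)


def invert_ma(betas, k):
    """alpha_1..alpha_k from (1 + beta_1 z + ... + beta_h z^h)^(-1), by recursion."""
    alpha = [1.0]
    for j in range(1, k + 1):
        alpha.append(-sum(b * alpha[j - 1 - i]
                          for i, b in enumerate(betas) if j - 1 - i >= 0))
    return alpha[1:]


betas = [0.5, 0.3]
estimates = durbin_b_vector(invert_ma(betas, 40), h=2)
print(estimates)
```

With exact autoregressive coefficients and a long enough truncation the solution reproduces the assumed $\beta$'s, which is the consistency property used in the text.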
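The asymptotic variance matrix of $b_1, \ldots, b_h$ is approximately $\frac{2}{n}\Big[E\Big(\frac{\partial^2 Q}{\partial\beta_i\,\partial\beta_j}\Big)\Big]^{-1}$, which to
the first order in $n$ is $U/n$, where
\[
U^{-1} = \begin{bmatrix}
\sum_{i=0}^{k}\alpha_i^2 & \sum_{i=0}^{k-1}\alpha_i\alpha_{i+1} & \cdots & \sum_{i=0}^{k-h+1}\alpha_i\alpha_{i+h-1}\\
\vdots & & \ddots & \vdots\\
\sum_{i=0}^{k-h+1}\alpha_i\alpha_{i+h-1} & \cdots & & \sum_{i=0}^{k}\alpha_i^2
\end{bmatrix}.
\]
Here, $\sum\alpha_i\alpha_{i+r}$ is the coefficient of $z^r$ in the expansion of
\[ (1 + \alpha_1 z + \cdots + \alpha_k z^k)(1 + \alpha_1 z^{-1} + \cdots + \alpha_k z^{-k}), \]
which for large $k$ is nearly equal to the coefficient of $z^r$ in the expansion of
\[ (1 + \beta_1 z + \cdots + \beta_h z^h)^{-1}(1 + \beta_1 z^{-1} + \cdots + \beta_h z^{-h})^{-1}. \]
This is equal to the covariance of $y_t$ and $y_{t+r}$ in the autoregressive series generated by
\[ y_t + \beta_1 y_{t-1} + \cdots + \beta_h y_{t-h} = \xi_t, \tag{16} \]
where $\{\xi_t\}$ has zero mean and unit variance. Thus $U^{-1}$ is the variance matrix of $h$ successive
observations of the series (16). Its inverse, $U$, may be obtained by means of the following
considerations.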

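Let $Y_1$ denote the (column) vector $\{y_1, \ldots, y_h\}$ and $Y_2$ the vector $\{y_{h+1}, \ldots, y_{2h}\}$. The
unconditional density of $Y_1$ and the conditional density of $Y_2$ given $Y_1$ are
\[ K_1\exp(-\tfrac12 Y_1'UY_1) \quad \text{and} \quad K_2\exp\Big\{-\frac12\sum_{t=h+1}^{2h}(y_t + \beta_1 y_{t-1} + \cdots + \beta_h y_{t-h})^2\Big\}, \]
respectively, where $K_1$ and $K_2$ are suitable constants. Thus the unconditional density of
$y_1, \ldots, y_{2h}$ is
\[ K_1K_2\exp\Big[-\frac12\Big\{Y_1'UY_1 + \sum_{t=h+1}^{2h}(y_t + \beta_1 y_{t-1} + \cdots + \beta_h y_{t-h})^2\Big\}\Big]. \tag{17} \]
Similarly, the density of $Y_2$ and the conditional density of $Y_1$ given $Y_2$ are, since a stationary
time series is symmetric with regard to direction along the time axis,
\[ K_1\exp(-\tfrac12 Y_2'UY_2) \quad \text{and} \quad K_2\exp\Big\{-\frac12\sum_{t=1}^{h}(y_t + \beta_1 y_{t+1} + \cdots + \beta_h y_{t+h})^2\Big\}. \]
Thus the unconditional density of $y_1, \ldots, y_{2h}$ is
\[ K_1K_2\exp\Big[-\frac12\Big\{\sum_{t=1}^{h}(y_t + \beta_1 y_{t+1} + \cdots + \beta_h y_{t+h})^2 + Y_2'UY_2\Big\}\Big]. \tag{18} \]
(17) and (18) must be identical since they represent the same density. Equating the first
$h$ rows and columns of the matrices of the quadratic forms in (17) and (18) we have
\[
\begin{bmatrix}
U_{11} + \beta_h^2 & U_{12} + \beta_{h-1}\beta_h & \cdots & U_{1h} + \beta_1\beta_h\\
U_{12} + \beta_{h-1}\beta_h & U_{22} + \beta_{h-1}^2 + \beta_h^2 & \cdots & \vdots\\
\vdots & & \ddots & \\
U_{1h} + \beta_1\beta_h & \cdots & & U_{hh} + \beta_1^2 + \cdots + \beta_h^2
\end{bmatrix}
=
\begin{bmatrix}
1 & \beta_1 & \cdots & \beta_{h-1}\\
\beta_1 & 1 + \beta_1^2 & \cdots & \vdots\\
\vdots & & \ddots & \\
\beta_{h-1} & \cdots & & 1 + \beta_1^2 + \cdots + \beta_{h-1}^2
\end{bmatrix}.
\]
On subtraction, we find
\[
U = \begin{bmatrix}
1 - \beta_h^2 & \beta_1 - \beta_{h-1}\beta_h & \beta_2 - \beta_{h-2}\beta_h & \cdots & \beta_{h-1} - \beta_1\beta_h\\
\beta_1 - \beta_{h-1}\beta_h & 1 + \beta_1^2 - \beta_{h-1}^2 - \beta_h^2 & \beta_1 + \beta_1\beta_2 - \beta_{h-2}\beta_{h-1} - \beta_{h-1}\beta_h & \cdots & \beta_{h-2} - \beta_2\beta_h\\
\beta_2 - \beta_{h-2}\beta_h & \beta_1 + \beta_1\beta_2 - \beta_{h-2}\beta_{h-1} - \beta_{h-1}\beta_h & \ddots & & \vdots\\
\vdots & & & \ddots & \beta_1 - \beta_{h-1}\beta_h\\
\beta_{h-1} - \beta_1\beta_h & \beta_{h-2} - \beta_2\beta_h & \cdots & \beta_1 - \beta_{h-1}\beta_h & 1 - \beta_h^2
\end{bmatrix}. \tag{19}
\]
Values of $U$ for $h = 1, 2, 3$ are
\[
h = 1:\ \big[\,1 - \beta_1^2\,\big]; \qquad
h = 2:\ \begin{bmatrix} 1 - \beta_2^2 & \beta_1 - \beta_1\beta_2\\ \beta_1 - \beta_1\beta_2 & 1 - \beta_2^2 \end{bmatrix}; \qquad
h = 3:\ \begin{bmatrix}
1 - \beta_3^2 & \beta_1 - \beta_2\beta_3 & \beta_2 - \beta_1\beta_3\\
\beta_1 - \beta_2\beta_3 & 1 + \beta_1^2 - \beta_2^2 - \beta_3^2 & \beta_1 - \beta_2\beta_3\\
\beta_2 - \beta_1\beta_3 & \beta_1 - \beta_2\beta_3 & 1 - \beta_3^2
\end{bmatrix}.
\]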

6. REMARKS ON THE ASSUMPTIONS 


The assumption of normality in the distribution of the $\epsilon$'s in models (1) and (13) was made
for convenience and can be dropped without much loss. The justification is that Mann &
Wald (1943) in deriving the asymptotically normal distribution of $a_1, \ldots, a_k$ assume only
that the $\epsilon$'s are independent and identically distributed variables with a distribution having
zero mean and finite moments of all orders. On these wider assumptions the estimates $b$
and $b_1, \ldots, b_h$ derived above are still consistent in the limit with limiting variance and
variance matrix given by (12) and $U/n$, where $U$ is given by (19), respectively. What we
lose by adopting the wider assumptions is the assurance that the estimators are efficient.
However, it is possible that the estimators could be proved to be efficient in the class of
estimators based on the sample serial correlation coefficients. This possibility ensues from
the asymptotic normality of the serial correlation coefficients of a moving-average process
taken together with Bartlett's (1946) result that first and second moments of the sample
serial correlation coefficients are independent of the parent distribution of the $\epsilon$'s.

The assumption that $|\beta| < 1$ in model (1) is less restrictive than might appear at first
sight. It is customary to impose the requirement $|\beta| < 1$, since otherwise there is an essential
indeterminacy arising from the fact that we cannot distinguish from the study of observed
values between observations generated by the model with $\beta = \beta_0$ and observations
generated by the model with $\beta = 1/\beta_0$. The indeterminacy disappears if we follow the
convention $|\beta| < 1$. If $|\beta| = 1$, the method proposed breaks down since the autoregressive
representation does not converge. However, other methods based on series expansions also
break down, e.g. methods based on the form of the likelihood given in (2). The point is not
of much practical importance since such cases are hardly likely to arise. Similar remarks
apply to the assumption for model (13) that the roots of $x^h + \beta_1 x^{h-1} + \cdots + \beta_h = 0$ have
modulus less than one.

It was assumed in § 3 that the estimates $a_1, \ldots, a_k$ of $\alpha_1, \ldots, \alpha_k$ are calculated by least
squares. Many statisticians will, however, prefer to work with estimates $a_1', \ldots, a_k'$ calculated
from the sample serial correlation coefficients $r_1, \ldots, r_k$ using the relations
\[
\begin{aligned}
a_1' + r_1 a_2' + \cdots + r_{k-1}a_k' + r_1 &= 0,\\
r_1 a_1' + a_2' + \cdots + r_{k-2}a_k' + r_2 &= 0,\\
&\;\;\vdots\\
r_{k-1}a_1' + r_{k-2}a_2' + \cdots + a_k' + r_k &= 0.
\end{aligned}
\]
Since there is no difference in asymptotic behaviour between $a_1', \ldots, a_k'$ and the least-squares
estimates $a_1, \ldots, a_k$, the asymptotic theory given above holds for estimates based
on $a_1', \ldots, a_k'$ as well as for estimates based on $a_1, \ldots, a_k$.


7. TESTS OF SIGNIFICANCE AND TESTS OF FIT


The above results furnish a basis for the construction of large-sample tests of significance.
Thus to test the hypothesis $\beta = \beta_0$ in (1) we calculate $b$ from (7) and test
\[ z = \sqrt{n}\,(b - \beta_0)(1 - \beta_0^2)^{-1/2} \]
as a standard normal deviate. Similarly, to test the hypothesis $\beta_i = \beta_{0i}$ $(i = 1, \ldots, h)$ in
(13) we test
\[ w = n\sum_{i=1}^{h}\sum_{j=1}^{h}u^{ij}(b_i - \beta_{0i})(b_j - \beta_{0j}) \]
as a $\chi^2$ variable with $h$ degrees of freedom,
where $[u^{ij}] = U^{-1}$ and $U$ is calculated from $\beta_{01}, \ldots, \beta_{0h}$ using (19).
A test of goodness-of-fit of model (1) is obtained by noting from (5) and (6) that
\[ nQ = n\Big\{(1+\beta^2)\sum_{i=0}^{k}a_i^2 + 2\beta\sum_{i=0}^{k-1}a_i a_{i+1} - 1\Big\} \]
is approximately distributed as $\chi^2$ with $k$ degrees of freedom. It may be verified by
substitution from (7) that $Q$ can be partitioned in the form
\[ Q = (1 - b^2)\sum_{i=0}^{k}a_i^2 - 1 + (b - \beta)^2\sum_{i=0}^{k}a_i^2. \]
Asymptotically, the term $n(b - \beta)^2\sum a_i^2$ is equivalent to the regression sum of squares in a
linear regression model, while the remainder is equivalent to the residual sum of squares.
Thus the goodness-of-fit of the model may be examined by testing
\[ n\Big\{(1 - b^2)\sum_{i=0}^{k}a_i^2 - 1\Big\} \tag{20} \]
as a $\chi^2$ variable with $k-1$ degrees of freedom.
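The partition of $Q$ is an algebraic identity in the $a_i$, so it can be confirmed for any set of coefficients. The sketch below is illustrative only: the coefficient values, the sample size and the function names are hypothetical, and the statistic (20) would be referred to a $\chi^2$ table with $k-1$ degrees of freedom, which is not done here.

```python
def lag_sums(a):
    """Return (sum a_i^2, sum a_i a_{i+1}) with a_0 = 1 prepended to a = [a_1..a_k]."""
    c = [1.0] + list(a)
    s0 = sum(v * v for v in c)
    s1 = sum(c[i] * c[i + 1] for i in range(len(c) - 1))
    return s0, s1


def q_value(beta, a):
    """Quadratic form (6)."""
    s0, s1 = lag_sums(a)
    return (1.0 + beta * beta) * s0 + 2.0 * beta * s1 - 1.0


def fit_statistic(n, a):
    """Statistic (20) together with the estimate b of (7)."""
    s0, s1 = lag_sums(a)
    b = -s1 / s0
    return n * ((1.0 - b * b) * s0 - 1.0), b


# Hypothetical fitted autoregression coefficients a_1, ..., a_5 and sample size.
a_fit = [-0.45, 0.20, -0.10, 0.05, -0.02]
n_obs = 100
beta0 = 0.5

statistic, b = fit_statistic(n_obs, a_fit)
s0, _ = lag_sums(a_fit)
regression_part = n_obs * (b - beta0) ** 2 * s0
print(b, statistic, regression_part, n_obs * q_value(beta0, a_fit))
```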
The test for the general model (13) is obtained by recalling that $b_1, \ldots, b_h$ are obtained by
minimizing the expression $Q$ defined by (14) with respect to $\beta_1, \ldots, \beta_h$. Substituting $b_1, \ldots, b_h$
for $\beta_1, \ldots, \beta_h$ in (14) and using the relations (15) we find for the minimum value of $Q$,
\[ Q = \sum_{i=0}^{k}a_i^2 + \sum_{j=1}^{h}b_j\sum_{i=0}^{k-j}a_i a_{i+j} - 1. \tag{21} \]
The goodness-of-fit of the model is tested by treating $nQ$ as a $\chi^2$ variable with $k-h$ degrees
of freedom.
This test of fit is clearly a good deal simpler to use than that suggested by Wold (1949).


8. SOME NUMERICAL RESULTS 


First-order moving-averages have been fitted to twenty series each containing 100
observations generated by the model
\[ x_t = \epsilon_t + \tfrac12\epsilon_{t-1}. \]
The $\{\epsilon_t\}$ were pseudo-random normal deviates calculated on the English Electric DEUCE
calculator. Autoregressive models of order five were fitted to each series. Denoting the
resulting coefficients by $a_1', \ldots, a_5'$ the moving-average parameter was then estimated by the
formula
\[ b = -\frac{a_1' + a_1'a_2' + \cdots + a_4'a_5'}{1 + a_1'^2 + \cdots + a_5'^2}. \]

For each series the simple estimator $c$ obtained from the first serial correlation $r_1$ was also
calculated. This is given by the formula

\[ c/(1 + c^2) = r_1, \]

taking the root having modulus less than one. Where the roots were imaginary $c$ was taken
to be one. The results are given in Table 1.
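The rule for $c$ amounts to solving the quadratic $r_1c^2 - c + r_1 = 0$. A minimal sketch follows; the function name is hypothetical, and the treatment of $r_1 < -\tfrac12$ (returning $-1$) is an assumption, since the text only states the convention of taking $c$ to be one when the roots are imaginary.

```python
import math


def simple_estimate(r1):
    """Root of c / (1 + c^2) = r1 with modulus below one; unit modulus if none is real."""
    if r1 == 0.0:
        return 0.0
    discriminant = 1.0 - 4.0 * r1 * r1
    if discriminant < 0.0:
        # Roots are complex: the text takes c = 1; the sign for negative r1
        # is an assumption made here for symmetry.
        return math.copysign(1.0, r1)
    return (1.0 - math.sqrt(discriminant)) / (2.0 * r1)


for r1 in (0.2, 0.4, 0.5, 0.6):
    print(r1, simple_estimate(r1))
```

The value $r_1 = 0.4$ corresponds to the true model, since $\rho_1 = \beta/(1+\beta^2) = 0.4$ when $\beta = \tfrac12$.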























Table 1

Series      b         c        Series      b         c
   1      0.4302    0.4085       11      0.5199    0.5916
   2      0.3639    0.4966       12      0.5304    0.4418
   3      0.3851    0.3509       13      0.5278    0.4467
   4      0.4751    0.4579       14      0.4856    0.5424
   5      0.4298    0.3863       15      0.3141    0.3524
   6      0.5466    0.6665       16      0.3380    0.2670
   7      0.5326    0.6106       17      0.4005    0.4648
   8      0.4156    0.6212       18      0.5528    1.0000
   9      0.4926    1.0000       19      0.3993    0.3649
  10      0.4142    0.2839       20      0.5069    0.4984

The sample means and variances are:

                    b          c
Mean              0.4531     0.5126
Variance          0.00536    0.0398
S.E. of mean      0.0164     0.0446

The claim for the efficiency of $b$ is well supported by these results. The observed variance,
namely 0.00536, is actually less than the theoretical value of $\frac{1}{100}\{1 - (1/2^2)\} = 0.0075$, and
is substantially less than the estimated variance of $c$. A further point in favour of $b$ is that
it is closer to the true value of $\tfrac12$ for 16 out of the 20 samples.

On the other hand, it is disappointing that the results for $b$ show such a strong downward
bias. The discrepancy between the observed mean of 0.4531 and the true value of $\tfrac12$ is well
over twice the estimated standard error. The following remedial measures were tried, but
in no case was the improvement satisfactory enough to warrant inclusion of the results:
the order of the fitted autoregressive scheme was raised from five to ten; a term was dropped
from the denominator of $b$ in the manner referred to at the end of § 3; the term $-\tfrac12\log(1-\beta^2)$
in the likelihood function was allowed for as described at the end of § 3; Quenouille's method
of bias reduction was used (see Durbin, 1959, for details of the method). This matter of
bias clearly calls for further consideration.


I am indebted to Professor M. G. Kendall for some valuable corrections to the first draft
of the paper and to Miss J. Grahame and Miss J. May for assistance with the computations.


APPENDIX

Evaluation of the limiting covariance determinant of the second-order moving-average process

Let $D_n$ be the $n$th-order determinant
\[
D_n = \begin{vmatrix}
a & b & c & 0 & \cdots & 0\\
b & a & b & c & & \vdots\\
c & b & a & b & \ddots & 0\\
0 & c & b & a & \ddots & c\\
\vdots & & \ddots & \ddots & \ddots & b\\
0 & \cdots & 0 & c & b & a
\end{vmatrix}.
\]





We proceed to calculate $\lim D_n$ for the case where $a = 1 + \beta_1^2 + \beta_2^2$, $b = \beta_1 + \beta_1\beta_2$, $c = \beta_2$.
Expanding the determinant we obtain after some reduction the fifth-order difference equation
\[ \{-1 + (a-c)z + (ac-b^2)z^2 + c(b^2-ac)z^3 + c^3(c-a)z^4 + c^5z^5\}D_n = 0, \tag{A 1} \]
where $z^rD_n$ denotes $D_{n-r}$. On substitution for $a$, $b$, $c$ we find that (A 1) factorizes in the form
\[ (z-1)(\beta_2z-1)(\beta_2^2z-1)\{\beta_2^2z^2 + (2\beta_2-\beta_1^2)z + 1\}D_n = 0. \]
Let
\[ \{\beta_2^2z^2 + (2\beta_2-\beta_1^2)z + 1\}D_n = C_n. \tag{A 2} \]
Then $C_n$ satisfies the difference equation
\[ (z-1)(\beta_2z-1)(\beta_2^2z-1)C_n = 0. \tag{A 3} \]
On examination it turns out that equation (A 1) is satisfied for $n > 2$ provided we take $D_0 = 1$,
$D_{-1} = 0$, $D_{-2} = 0$. Substituting in (A 2) we have
\[
\left.
\begin{aligned}
C_0 &= 1,\\
C_1 &= (1+\beta_2)^2,\\
C_2 &= (1+\beta_2^2)(1+\beta_2)^2 + \beta_2^2.
\end{aligned}
\right\} \tag{A 4}
\]
The general solution of (A 3) is $C_n = A_1 + A_2\beta_2^n + A_3\beta_2^{2n}$.


When specifying the moving-average model we stipulated that $x^2 + \beta_1x + \beta_2 = 0$ has roots with
modulus less than one. It follows that $|\beta_2| < 1$. Consequently $\lim_{n\to\infty}C_n = A_1$. Substituting in (A 4) we
find $A_1 = (1-\beta_2)^{-2}$.
From (A 2) we have that
\[ \{\beta_2^2 + (2\beta_2-\beta_1^2) + 1\}\lim_{n\to\infty}D_n = \lim_{n\to\infty}C_n. \]
Consequently,
\[ \lim_{n\to\infty}D_n = \frac{1}{(1-\beta_2)^2\{(1+\beta_2)^2-\beta_1^2\}}. \tag{A 5} \]
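The limit (A 5) can be checked numerically by evaluating $D_n$ for a moderately large $n$. The following is a sketch only: the function name and the parameter values are assumptions, and the determinant is obtained by elimination without pivoting, which is adequate here because the matrix is a positive definite variance matrix.

```python
def ma2_determinant(beta1, beta2, n):
    """D_n for the band matrix with a = 1 + b1^2 + b2^2, b = b1 + b1 b2, c = b2."""
    band = {0: 1.0 + beta1 ** 2 + beta2 ** 2, 1: beta1 + beta1 * beta2, 2: beta2}
    m = [[band.get(abs(i - j), 0.0) for j in range(n)] for i in range(n)]
    det = 1.0
    for col in range(n):
        pivot = m[col][col]
        det *= pivot
        # Only the two rows below the pivot have non-zero entries in this column.
        for row in range(col + 1, min(col + 3, n)):
            factor = m[row][col] / pivot
            for j in range(col, min(col + 3, n)):
                m[row][j] -= factor * m[col][j]
    return det


def limit_a5(beta1, beta2):
    """Limiting value (A 5)."""
    return 1.0 / ((1.0 - beta2) ** 2 * ((1.0 + beta2) ** 2 - beta1 ** 2))


for n in (5, 10, 20, 60):
    print(n, ma2_determinant(0.5, 0.3, n))
print(limit_a5(0.5, 0.3))
```

Setting $\beta_2 = 0$ reduces the matrix to the first-order case of § 2, for which the limit is $1/(1-\beta_1^2)$.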
It is remarkable that this limiting value is the exact reciprocal of the very similar determinant
\[
H_n = \begin{vmatrix}
1 & \beta_1 & \beta_2 & 0 & \cdots & & 0\\
\beta_1 & 1+\beta_1^2 & \beta_1(1+\beta_2) & \beta_2 & 0 & \cdots & \vdots\\
\beta_2 & \beta_1(1+\beta_2) & 1+\beta_1^2+\beta_2^2 & \beta_1(1+\beta_2) & \beta_2 & & \\
0 & \beta_2 & \beta_1(1+\beta_2) & 1+\beta_1^2+\beta_2^2 & \ddots & \ddots & \\
\vdots & & \ddots & \ddots & \ddots & & \beta_2\\
 & & & & & 1+\beta_1^2 & \beta_1\\
0 & \cdots & & & \beta_2 & \beta_1 & 1
\end{vmatrix},
\]
which differs from $D_n$ only in the first and last two rows and columns. $H_n$ is the determinant of the
reciprocal of the variance matrix of $n$ successive observations of the autoregressive series

\[ x_t + \beta_1x_{t-1} + \beta_2x_{t-2} = \epsilon_t, \]

where $\epsilon_t$ has unit variance. Thus the limiting covariance determinant of $n$ successive values of a
second-order moving-average series is the same as the covariance determinant of $n$ observations of a
second-order autoregressive series with the same coefficients and the same residual variance. It is likely that
this result is generally true and it would be interesting to see a general proof of it.


THE COMPLETE AMALGAMATION INTO BLOCKS, BY WEIGHTED
MEANS, OF A FINITE SET OF REAL NUMBERS


By R. E. MILES 
Statistical Laboratory, University of Cambridge 


1. INTRODUCTION AND SUMMARY 


The problem discussed below arose in Bartholomew (1959a), in which is proposed a test of 
homogeneity of ‘ordered’ alternatives. 

On p. 40 of this paper Bartholomew defined the probability $P(l, k; a_1, a_2, \ldots, a_k)$ and
proceeded to determine its values for $1 \le l \le k \le 4$ and general $a_i$ $(i = 1, \ldots, k)$.

In the special, but important, case of equal weights ($a_1 = a_2 = \cdots = a_k$) it may readily
be verified that $P(l, k; a_1, a_2, \ldots, a_k)$ is independent of the common value of the $a_i$, and thus
may be written $\hat{P}(l, k)$, say; this notation departs slightly from Bartholomew's in the use
of a circumflex, but this is done to avoid confusion in the Appendix. Bartholomew
determined values of $\hat{P}(l, k)$ for $1 \le l \le k \le 5$ and, on the basis of these results, conjectured on
p. 43 the recurrence relations


> 
~ 


Pky => E Ba,k-jPl-1,j) @<t<b, (*) 
j=l-1 
where Pak) = ;: “ge 


We now make a brief survey of the present paper. 


Summary 


No knowledge is presupposed of Bartholomew (1959a), and the notation and terminology
differ to some extent. In § 2 an ‘n-collection’ $[n]$ is defined as a set of n real numbers,
called ‘ordinates’, and n positive real numbers, called ‘weights’,† satisfying a certain
condition $\mathcal{C}$. It is then possible to define a ‘random array’ of $[n]$. The ‘complete
amalgamation’, by the amalgamation process $\mathcal{A}_1$, of an array into ‘blocks’ by decreasing
weighted means (a generalization of Bartholomew’s amalgamation process) is described in § 3;
with the aid of an equivalent amalgamation process $\mathcal{A}_2$, the complete amalgamation into
blocks by $\mathcal{A}_1$ is proved to be unique (Theorem 1). In § 4, $P_{[n]}(k,n)$ is defined as the
probability that a random array of $[n]$ yields k blocks on complete amalgamation. There follows
the interesting main result of the paper (Theorem 2), viz. $P_{[n]}(k,n) = P(k,n)$, independent
of $[n]$, and recurrence relations (10)-(13) are given which determine the $P(k,n)$; in the
Corollary to Theorem 2 the ordinates are taken as a sample from a continuous probability
distribution, and this provides the final link with Bartholomew (1959a). The generating
functions of $P(k,n)$ are determined in § 5 and, making use of one of these expressions, we
obtain in § 6 the formula

$$P(k,n) = \frac{1}{n!}\,|S_n^k|,$$

where the $S_n^k$ are the Stirling Numbers of the First Kind. General and limiting properties, and
a table of values, of $P(k,n)$ are given in § 7. Finally, in § 8 two related problems are discussed
which both yield an identical probability distribution.

The conjecture (*)-(**) is in fact true. This is proved in the Appendix where we obtain
the result

$$\hat{P}(k,n) = P(k,n) = \frac{1}{n!}\,|S_n^k|.$$

† It should be noted that the ‘weights’ we define have a different meaning from those used by
Bartholomew.


On account of this, Bartholomew’s test in the case of equal weights $a_i$ may be completely
specified. Further details arising from this development are included in a second paper,
Bartholomew (1959b).


2. DEFINITIONS AND NOTATION 


We consider a set of n real numbers $X_1, \ldots, X_n$, the ‘ordinates’, and a set of n positive
real numbers $W_1, \ldots, W_n$, the ‘weights’.

We define a condition $\mathcal{C}$ on the weighted means of the ordinates with respect to the
weights as follows:

$$\mathcal{C}:\quad \frac{W_{p'_1}X_{p_1} + \ldots + W_{p'_r}X_{p_r}}{W_{p'_1} + \ldots + W_{p'_r}} \ne \frac{W_{p'_{r+1}}X_{p_{r+1}} + \ldots + W_{p'_{r+s}}X_{p_{r+s}}}{W_{p'_{r+1}} + \ldots + W_{p'_{r+s}}}$$

for any two (r+s)-permutations $p(p_1 \ldots p_{r+s})$, $p'(p'_1 \ldots p'_{r+s})$ of the integers 1, ..., n, where
$1 \le r, s \le n-1$ and $2 \le r+s \le n$.

Imposing $\mathcal{C}$ in what follows enables us to use strict inequalities (>, <) instead of weak
ones (≥, ≤). In particular, taking r = s = 1, we see that $\mathcal{C}$ requires that $X_i \ne X_j$ if $i \ne j$,
i.e. the $X_i$ are n different real numbers; however, $\mathcal{C}$ places no corresponding restriction
on the $W_i$.

We define an ‘n-collection’ as a set of n ordinates and n weights satisfying $\mathcal{C}$, and denote
it by $[n]$. By an ‘n-array’ is meant a 2 × n matrix whose elements constitute an n-collection,
the first row being the ordinates, and the second row the weights, in some order. $[n]$ thus
gives rise to $(n!)^2$ different n-arrays, the ‘arrays of $[n]$’. By an r-subarray of an n-array
$(1 \le r \le n)$ is meant an r-array obtained by deleting columns of the n-array so that only
r adjacent columns remain. $\mathcal{C}$ holds a fortiori for the elements of subarrays. (Dropping the
prefix, the terms collection, array and subarray have obvious meanings.)

We select independently, at random, two permutations $\sigma(\sigma_1 \ldots \sigma_n)$, $\sigma'(\sigma'_1 \ldots \sigma'_n)$ of the
integers 1, ..., n. There are $(n!)^2$ ways in which we can do this, all equally probable, and so
each ordered permutation-pair $(\sigma, \sigma')$ has probability $1/(n!)^2$ of being selected.

We call

$$S_n^{\sigma\sigma'} = \begin{pmatrix} X_{\sigma_1} & \ldots & X_{\sigma_n} \\ W_{\sigma'_1} & \ldots & W_{\sigma'_n} \end{pmatrix},$$

the n-array corresponding to our random choice, a ‘random array’ (R.A.) of $[n]$.

We write

$$Z_i^j = W_{\sigma'_i}X_{\sigma_i} + W_{\sigma'_{i+1}}X_{\sigma_{i+1}} + \ldots + W_{\sigma'_j}X_{\sigma_j},$$

$$W_i^j = W_{\sigma'_i} + W_{\sigma'_{i+1}} + \ldots + W_{\sigma'_j},$$

$$X_i^j = Z_i^j / W_i^j \quad \text{(a ‘weighted mean’)},$$

and $Y_i^j$ as the 2 × 1 matrix $\begin{pmatrix} X_i^j \\ W_i^j \end{pmatrix}$. We note that $Y_i^i$ is the ith column of $S_n^{\sigma\sigma'}$,
and that we may write

$$S_n^{\sigma\sigma'} = (Y_1^1 \ldots Y_n^n). \tag{1}$$













3. THE COMPLETE AMALGAMATION OF AN ARRAY 


$S_n^{\sigma\sigma'}$ is taken as the array.

The amalgamation process $\mathcal{A}_1$. We operate stage by stage on $S_n^{\sigma\sigma'}$, as follows. In each
stage, having started afresh with an r-array $(2 \le r \le n)$, either the process terminates or
we finish with a new (r−1)-array. Thus, after the (n−r)th stage, we have the r-array

$$S_r^{\sigma\sigma'} = (Y_{i_0+1}^{i_1} \ldots Y_{i_{r-1}+1}^{i_r}), \quad \text{where } i_0 = 0,\ i_r = n. \tag{2}$$

If $X_{i_0+1}^{i_1} > \ldots > X_{i_{r-1}+1}^{i_r}$ we proceed no further, the process being said to have terminated.
Otherwise there are one or more 2-subarrays $(Y_{i_{j-1}+1}^{i_j}\ Y_{i_j+1}^{i_{j+1}})$, where $1 \le j \le r-1$, in $S_r^{\sigma\sigma'}$ such
that $X_{i_{j-1}+1}^{i_j} < X_{i_j+1}^{i_{j+1}}$. In this case we choose quite freely any one of these subarrays and
replace it by the 1-subarray $Y_{i_{j-1}+1}^{i_{j+1}}$, thereby obtaining the (r−1)-array $S_{r-1}^{\sigma\sigma'}$. We thus
obtain successively the arrays $S_n^{\sigma\sigma'}, S_{n-1}^{\sigma\sigma'}, \ldots, S_k^{\sigma\sigma'}$, the process terminating after n−k
stages, where $1 \le k \le n$; the final k-array $S_k^{\sigma\sigma'}$ is such that $X_{i_0+1}^{i_1} > \ldots > X_{i_{k-1}+1}^{i_k}$. It should
be noted that the process automatically terminates if $S_1^{\sigma\sigma'}$ is reached.

We observe that $S_{r-1}^{\sigma\sigma'}$ has the same basic form as $S_r^{\sigma\sigma'}$, as written on the right-hand side
of (2); by (1), $S_n^{\sigma\sigma'}$ also has this property. Hence, by induction, so do all the $S_s^{\sigma\sigma'}$ $(k \le s \le n)$.
This ensures that the process is a valid one and may always be carried out as described.

We say that the R.A. $S_n^{\sigma\sigma'}$ of $[n]$ has yielded k ‘blocks’ $Y_{i_0+1}^{i_1}, \ldots, Y_{i_{k-1}+1}^{i_k}$ in this ‘complete
amalgamation’ (complete amal.) by $\mathcal{A}_1$.

It is seen immediately, from the nature of $\mathcal{A}_1$, that $S_{n-1}^{\sigma\sigma'}, \ldots, S_{k+1}^{\sigma\sigma'}$ are not, in general,
unique.

However, we now prove that k and $S_k^{\sigma\sigma'}$ are unique. This we do by first defining an alternative
amalgamation process $\mathcal{A}_2$ which gives a unique complete amal. Then we demonstrate
(Theorem 1) the exact equivalence of a complete amal. by $\mathcal{A}_1$ with the complete amal. by $\mathcal{A}_2$.


The amalgamation process $\mathcal{A}_2$. We take as first block $Y_1^{l_1}$, where $X_1^{l_1} = \max_{l=1,\ldots,n} X_1^l$; $Y_{l_1+1}^{l_2}$ as
second block, where $X_{l_1+1}^{l_2} = \max_{l=(l_1+1),\ldots,n} X_{l_1+1}^l$; and so on. The process terminates after l
steps, where $1 \le l \le n$, giving the complete amal. of $S_n^{\sigma\sigma'}$ into l blocks by $\mathcal{A}_2$.


THEOREM 1. Any complete amal. of an array by $\mathcal{A}_1$ is identical with its unique complete
amal. by $\mathcal{A}_2$.

Proof. We first show, as a consequence of $X_1^{l_1}$ being $\max_{l=1,\ldots,n} X_1^l$, that

$$X_i^{l_1} > X_1^{l_1} > X_{l_1+1}^j \tag{3}$$

for all i, j such that $2 \le i \le l_1$ and $l_1+1 \le j \le n$. The first inequality is excluded if $l_1 = 1$.

For suppose the first part of (3) did not hold. Then

$$Z_i^{l_1} < W_i^{l_1} X_1^{l_1}. \tag{4}$$

By definition

$$Z_1^{i-1} < W_1^{i-1} X_1^{l_1}. \tag{5}$$

Adding (4) and (5), we obtain $Z_1^{l_1} < W_1^{l_1} X_1^{l_1}$, or $X_1^{l_1} < X_1^{l_1}$, a contradiction.

Now suppose the second part of (3) did not hold. Then

$$W_{l_1+1}^j X_1^{l_1} < Z_{l_1+1}^j. \tag{6}$$

By definition

$$W_1^{l_1} X_1^{l_1} = Z_1^{l_1}. \tag{7}$$

Adding (6) and (7), we obtain $W_1^j X_1^{l_1} < Z_1^j$, or $X_1^{l_1} < X_1^j$, another contradiction.
The proof of (3) is thus completed.

It immediately follows that no amalgamation is possible in $\mathcal{A}_1$ between any ordinate of
the type $X_i^{l_1}$ with one of the type $X_{l_1+1}^j$. Thus $\mathcal{A}_1$ is at least as ‘fine’ as $\mathcal{A}_2$. Hence it now
remains only to prove conversely that $\mathcal{A}_2$ is at least as fine as $\mathcal{A}_1$, i.e. that in a complete
amal. by $\mathcal{A}_1$ the $l_1$-array $(Y_1^1 \ldots Y_{l_1}^{l_1})$ cannot be separated into more than one block, into
t blocks, say, where $t \ge 2$.

For, suppose this were possible, these blocks being $Y_1^{i_1}, \ldots, Y_{i_{t-1}+1}^{l_1}$. Then

$$X_{i_{t-1}+1}^{l_1} > X_1^{l_1} \tag{8}$$

by the first inequality of (3). Also, since this is a complete amal. by $\mathcal{A}_1$,

$$X_1^{i_1} > \ldots > X_{i_{t-1}+1}^{l_1}. \tag{9}$$

(8) and (9) together imply that $X_1^{l_1} > X_1^{l_1}$, a contradiction.

Hence the first block in all complete amals. by $\mathcal{A}_1$ is the same as the first block in the
complete amal. by $\mathcal{A}_2$. We may now similarly prove that the second block has the same
property. And so on.

We may define an amalgamation process $\mathcal{A}_3$, which is the ‘reverse’ of $\mathcal{A}_2$.

The amalgamation process $\mathcal{A}_3$. We take as last block $Y_{m_{m-1}+1}^{n}$, where $X_{m_{m-1}+1}^{n} = \min_{m=1,\ldots,n} X_m^n$;
$Y_{m_{m-2}+1}^{m_{m-1}}$ as penultimate block, where $X_{m_{m-2}+1}^{m_{m-1}} = \min_{m=1,\ldots,m_{m-1}} X_m^{m_{m-1}}$; and so on. This process
terminates after m steps, where $1 \le m \le n$, giving the complete amal. of $S_n^{\sigma\sigma'}$ into m
blocks by $\mathcal{A}_3$.

We may prove the theorem resulting from the replacement of $\mathcal{A}_2$ by $\mathcal{A}_3$ in the statement
of Theorem 1 by a similar method.

Thus we may now speak without ambiguity of the ‘complete amal. of an array into blocks’,
without specifying which of the processes $\mathcal{A}_1$, $\mathcal{A}_2$ or $\mathcal{A}_3$ has been employed.
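
The two processes of § 3 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `amalgamate_a1` and `amalgamate_a2` are hypothetical, exact rational arithmetic is used so that comparisons of means are not affected by rounding, and the input is assumed to satisfy the no-ties condition on weighted means. Each function returns the block lengths, so the assertion of Theorem 1 is that both return the same list.

```python
import random
from fractions import Fraction


def amalgamate_a1(xs, ws, rng=None):
    """First process: merge, in any order, adjacent blocks whose means increase left to right."""
    rng = rng or random.Random(0)
    # Each block is (weighted sum Z, total weight W, number of columns).
    blocks = [(Fraction(x) * w, Fraction(w), 1) for x, w in zip(xs, ws)]
    while True:
        candidates = [i for i in range(len(blocks) - 1)
                      if blocks[i][0] / blocks[i][1] < blocks[i + 1][0] / blocks[i + 1][1]]
        if not candidates:          # means are strictly decreasing: terminated
            return [b[2] for b in blocks]
        i = rng.choice(candidates)  # "choose quite freely" one offending pair
        (z1, w1, c1), (z2, w2, c2) = blocks[i], blocks[i + 1]
        blocks[i:i + 2] = [(z1 + z2, w1 + w2, c1 + c2)]


def amalgamate_a2(xs, ws):
    """Second process: each block ends where the weighted mean taken from the block start is greatest."""
    sizes, start, n = [], 0, len(xs)
    while start < n:
        z = w = Fraction(0)
        best_mean, best_end = None, start
        for end in range(start, n):
            z += Fraction(xs[end]) * ws[end]
            w += ws[end]
            if best_mean is None or z / w > best_mean:
                best_mean, best_end = z / w, end
        sizes.append(best_end - start + 1)
        start = best_end + 1
    return sizes


xs, ws = [2, 5, 1, 4, 3], [1, 1, 1, 1, 1]
print(amalgamate_a1(xs, ws), amalgamate_a2(xs, ws))
```

For the array above both processes give blocks of lengths 2 and 3, with means 3.5 and 8/3.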


4. THE PROBABILITIES P(k,n) 
Let * be some condition we may impose on a R.A. of $[n]$. We define

$$P_{[n]}(k, n \mid *) = \text{prob. (R.A. of } [n] \text{ yields } k \text{ blocks} \mid \text{R.A. satisfies } *)$$

$$= \frac{\text{no. of R.A.’s of } [n] \text{ satisfying } * \text{ and yielding } k \text{ blocks}}{\text{no. of R.A.’s of } [n] \text{ satisfying } *}$$

and $P_{[n]}(k, n)$ as the corresponding prob. with no condition on the R.A.’s.
We notice that $P_{[n]}(k, n \mid *)$, $P_{[n]}(k, n)$ are only defined for integers k, n such that $1 \le k \le n$.


THEOREM 2. $P_{[n]}(k,n)$ is independent of the n-collection $[n]$, and thus may be written
$P(k,n)$. The $P(k,n)$ are determined by

$$P(1,1) = 1 \tag{10}$$

and the three recurrence relations (for $n \ge 2$)

$$P(1,n) = \frac{n-1}{n} P(1,n-1), \tag{11}$$

$$P(k,n) = \frac{1}{n}\left[P(k-1,n-1) + (n-1)P(k,n-1)\right] \quad (2 \le k \le n-1), \tag{12}$$

and

$$P(n,n) = \frac{1}{n} P(n-1,n-1). \tag{13}$$


Proof. (10) is immediate.

To complete the proof of Theorem 2, we proceed by induction. Let $\mathcal{P}_r$ denote the
proposition {$P_{[r]}(k, r)$ is independent of $[r]$ for $1 \le k \le r$, and thus equals $P(k, r)$}. Let us suppose
that $\mathcal{P}_{n-1}$ is true.

Let $X_{(i)} = \max_{i=1,\ldots,n} X_i$. $X_{(i)}$ occurs, with equal probability 1/n, in each of the ‘positions’
(synonymous with columns of a matrix) 1, ..., n of a R.A. of $[n]$.

We subdivide the arrays of $[n]$ into ‘classes’ $\alpha_{i'}$, $\beta_{jj'i'}$, such that each of the $(n!)^2$ arrays of $[n]$
belongs to one, and only one, class. We say a R.A. belongs to $\alpha_{i'}$ $\{i' = 1, \ldots, n\}$ if the 1-subarray
$\begin{pmatrix} X_{(i)} \\ W_{i'} \end{pmatrix}$ occurs in position 1 of the array, and to $\beta_{jj'i'}$ $\{j = 1, \ldots, (i)-1, (i)+1, \ldots, n;\ j', i' = 1, \ldots, n$,
where $i' \ne j'\}$ if the 2-subarray $\begin{pmatrix} X_j & X_{(i)} \\ W_{j'} & W_{i'} \end{pmatrix}$ occurs in any of the positions (1,2), (2,3), ..., (n−1, n)
of the array. (We emphasize that a subarray of an array contains adjacent columns of the
array.)

It is easily seen that the above classes are mutually exclusive, and together contain all the
$(n!)^2$ arrays of $[n]$.

The classes $\alpha$, $\beta$ have this same property if we write

$$\alpha = \bigcup_{i'=1}^{n} \alpha_{i'} \quad \text{and} \quad \beta = \bigcup_{\text{all admissible } j,\,j',\,i'} \beta_{jj'i'}.$$

We now consider these classes individually.


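$\alpha_{i'}$. In a complete amal. the 1-subarray $\begin{pmatrix} X_{(i)} \\ W_{i'} \end{pmatrix}$ will obviously be the first block. If in each
member of $\alpha_{i'}$ we delete this subarray from position 1, we obtain every one of the arrays of
the (n−1)-collection $[n-1]_{i'}$, obtained from $[n]$ by omitting the ordinate $X_{(i)}$ and the weight
$W_{i'}$. Hence, if an array of $[n-1]_{i'}$ yields k−1 blocks, then the corresponding member of $\alpha_{i'}$
yields k blocks; each member of $\alpha_{i'}$ thus yields at least two blocks. But our inductive
hypothesis applies to $[n-1]_{i'}$.

Hence, for $n \ge 2$,

$$P_{[n]}(1, n \mid \text{R.A.} \in \alpha_{i'}) = 0 \tag{14}$$

and

$$P_{[n]}(k, n \mid \text{R.A.} \in \alpha_{i'}) = P_{[n-1]_{i'}}(k-1, n-1) = P(k-1, n-1) \quad (2 \le k \le n). \tag{15}$$

But

$$P_{[n]}(k, n \mid \text{R.A.} \in \alpha) = \sum_{i'=1}^{n} P_{[n]}(k, n \mid \text{R.A.} \in \alpha_{i'})\ \text{prob. (R.A.} \in \alpha_{i'} \mid \text{R.A.} \in \alpha).$$

Hence, using (14) and (15), we have for $n \ge 2$

$$P_{[n]}(1, n \mid \text{R.A.} \in \alpha) = 0 \tag{16}$$

and

$$P_{[n]}(k, n \mid \text{R.A.} \in \alpha) = P(k-1, n-1) \sum_{i'=1}^{n} \text{prob. (R.A.} \in \alpha_{i'} \mid \text{R.A.} \in \alpha) = P(k-1, n-1) \quad (2 \le k \le n). \tag{17}$$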


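$\beta_{jj'i'}$. Since $X_j < X_{(i)}$, we may by $\mathcal{A}_1$ amalgamate the 2-subarray $\begin{pmatrix} X_j & X_{(i)} \\ W_{j'} & W_{i'} \end{pmatrix}$ to form the
1-subarray

$$\begin{pmatrix} \dfrac{W_{j'}X_j + W_{i'}X_{(i)}}{W_{j'} + W_{i'}} \\ W_{j'} + W_{i'} \end{pmatrix}.$$

Thus we cannot obtain n blocks in a complete amal. Moreover,
operating in this way on all the members of $\beta_{jj'i'}$, we obtain every one of the arrays of the
(n−1)-collection $[n-1]_{jj'i'}$, obtained from $[n]$ by substituting $(W_{j'}X_j + W_{i'}X_{(i)})/(W_{j'} + W_{i'})$ for $X_j$ and
$X_{(i)}$, and $(W_{j'} + W_{i'})$ for $W_{j'}$ and $W_{i'}$, respectively. Hence, if an array of $[n-1]_{jj'i'}$ yields k blocks,
then the corresponding member of $\beta_{jj'i'}$ also yields k blocks. But our inductive hypothesis
applies to $[n-1]_{jj'i'}$.

Hence, for $n \ge 2$,

$$P_{[n]}(k, n \mid \text{R.A.} \in \beta_{jj'i'}) = P_{[n-1]_{jj'i'}}(k, n-1) = P(k, n-1) \quad (1 \le k \le n-1) \tag{18}$$

and

$$P_{[n]}(n, n \mid \text{R.A.} \in \beta_{jj'i'}) = 0. \tag{19}$$

But

$$P_{[n]}(k, n \mid \text{R.A.} \in \beta) = \sum_{\text{all admissible } j,\,j',\,i'} P_{[n]}(k, n \mid \text{R.A.} \in \beta_{jj'i'})\ \text{prob. (R.A.} \in \beta_{jj'i'} \mid \text{R.A.} \in \beta).$$

Hence, using (18) and (19), we have for $n \ge 2$

$$P_{[n]}(k, n \mid \text{R.A.} \in \beta) = P(k, n-1) \sum_{\text{all admissible } j,\,j',\,i'} \text{prob. (R.A.} \in \beta_{jj'i'} \mid \text{R.A.} \in \beta) = P(k, n-1) \quad (1 \le k \le n-1), \tag{20}$$

and

$$P_{[n]}(n, n \mid \text{R.A.} \in \beta) = 0. \tag{21}$$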
Now

$$P_{[n]}(k, n) = P_{[n]}(k, n \mid \text{R.A.} \in \alpha)\ \text{prob. (R.A.} \in \alpha) + P_{[n]}(k, n \mid \text{R.A.} \in \beta)\ \text{prob. (R.A.} \in \beta).$$

But prob. (R.A. ∈ α) = 1/n and prob. (R.A. ∈ β) = (n−1)/n.
Hence, using (16), (17), (20) and (21), we have

$$P_{[n]}(1, n) = 0 \cdot \frac{1}{n} + P(1, n-1)\,\frac{n-1}{n}, \tag{22}$$

$$P_{[n]}(k, n) = P(k-1, n-1)\,\frac{1}{n} + P(k, n-1)\,\frac{n-1}{n} \quad (2 \le k \le n-1), \tag{23}$$

and

$$P_{[n]}(n, n) = P(n-1, n-1)\,\frac{1}{n} + 0 \cdot \frac{n-1}{n}. \tag{24}$$

Thus we see from (22)-(24) that the truth of $\mathcal{P}_{n-1}$ implies the truth of $\mathcal{P}_n$. But $\mathcal{P}_1$ is
true. Hence, by induction, $\mathcal{P}_r$ is true for all positive integers r, and so we may delete the
suffix $[n]$ in (22)-(24), thus giving us the recurrence relations (11)-(13).
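
Theorem 2 can be checked numerically for small n. The following Python sketch (all function names are hypothetical) tabulates P(k,n) from (10)-(13) and compares it with an exhaustive count over all (n!)² arrays of one particular 4-collection. The ordinates are taken as powers of 1000 and the weights as small integers, an assumption made here only so that no two weighted means over disjoint sets can coincide; the block count uses a stack, which is one admissible order of merging under the first amalgamation process.

```python
from fractions import Fraction
from itertools import permutations


def p_table(n_max):
    """P(k, n) for 1 <= k <= n <= n_max from relations (10)-(13)."""
    P = {(1, 1): Fraction(1)}
    for n in range(2, n_max + 1):
        for k in range(1, n + 1):
            from_alpha = P.get((k - 1, n - 1), Fraction(0))  # zero when k = 1
            from_beta = P.get((k, n - 1), Fraction(0))       # zero when k = n
            P[(k, n)] = (from_alpha + (n - 1) * from_beta) / n
    return P


def count_blocks(xs, ws):
    """Number of blocks on complete amalgamation (merge while left mean < right mean)."""
    stack = []  # entries are (weighted sum, total weight)
    for x, w in zip(xs, ws):
        z, w = Fraction(x) * w, Fraction(w)
        while stack and stack[-1][0] / stack[-1][1] < z / w:
            pz, pw = stack.pop()
            z, w = z + pz, w + pw
        stack.append((z, w))
    return len(stack)


def brute_force(xs, ws):
    """Exact distribution of the block count over all (n!)^2 arrays of the collection."""
    n = len(xs)
    counts = [0] * (n + 1)
    total = 0
    for px in permutations(xs):
        for pw in permutations(ws):
            counts[count_blocks(px, pw)] += 1
            total += 1
    return {k: Fraction(counts[k], total) for k in range(1, n + 1)}


P = p_table(4)
print([P[(k, 4)] for k in range(1, 5)])
print(brute_force([1, 10**3, 10**6, 10**9], [1, 2, 3, 7]))
```

Changing the weights (while keeping them small positive integers) should leave the second line unchanged, which is the content of the theorem.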

COROLLARY. All the above theory, including Theorems 1 and 2, applies (with obvious
appropriate modification) to the amalgamation of the array $S_n = \begin{pmatrix} x_1 & \ldots & x_n \\ 1 & \ldots & 1 \end{pmatrix}$, where $x_i$ is the
ith member of an ordered, i.e. successive, sample of size n from any continuous probability
distribution.

Proof. This is because

(a) for ordinates $X_i$ chosen as a finite sample from a continuous probability distribution,
and weights $W_i$ chosen as any set of positive real numbers (and hence, in particular, as
$W_1 = \ldots = W_n = 1$), the condition $\mathcal{C}$ holds with probability 1; and (b) if we are given the
members of the sample, but not their order of occurrence, then any one of their n! possible
orderings is equally probable to occur.

This work was actuated by an attempt to prove the conjectured recurrence relations
(11)-(13) of Theorem 2 for the array $S_n$ of the Corollary. It should be observed that the
induction argument cannot be carried through using unit weights, the use of ‘generalized’
weights being essential. In fact, to prove the Corollary, it would have sufficed in the foregoing
theory to take the weights $W_i$ as any set of positive integers which, together with the
same ordinates, satisfied $\mathcal{C}$.


5. THE GENERATING FUNCTIONS ASSOCIATED WITH THE P(k,n)


As usual, the operations below are merely formal, i.e. they give results which may afterwards
be verified; see, for example, Riordan (1958, pp. 19-20).

We define the generating functions $F_k(z)$, $G_n(t)$ and $H(z,t)$ as follows:

$$F_k(z) = \sum_{n=k}^{\infty} P(k,n) z^n,$$

$$G_n(t) = \sum_{k=1}^{n} P(k,n) t^k$$

and

$$H(z,t) = \sum_{n=1}^{\infty} \sum_{k=1}^{n} P(k,n) t^k z^n. \tag{25}$$

Multiplying (11) by $tz^n$, (12) by $t^k z^n$, (13) by $t^n z^n$, and summing we obtain

$$\frac{\partial H}{\partial z} = \frac{tH}{1-z}, \tag{26}$$

the solution of which is $H = C(1-z)^{-t}$, where C is a constant.
Expanding the right-hand side as a binomial series,

$$H = C \sum_{n=0}^{\infty} \frac{t(t+1)\ldots(t+n-1)}{n!} z^n. \tag{27}$$

But

$$\sum_{k=1}^{n} P(k,n) = 1, \quad \text{or} \quad G_n(1) = 1. \tag{28}$$

Thus, by (26)-(28), C = 1,

$$H(z,t) = (1-z)^{-t}, \tag{29}$$

and

$$G_n(t) = \frac{t(t+1)\ldots(t+n-1)}{n!}. \tag{30}$$

By (29)

$$H = \exp\left[t \log\left(\frac{1}{1-z}\right)\right].$$

Expanding the right-hand side, this time as an exponential series,

$$H = 1 + t\log\{1/(1-z)\} + \ldots + \frac{t^k[\log\{1/(1-z)\}]^k}{k!} + \ldots. \tag{31}$$

Comparing (25) and (31), it is seen that

$$F_k(z) = \frac{[\log\{1/(1-z)\}]^k}{k!}. \tag{32}$$

(30) and (32) enable us to determine P(k,n) for fixed n and k, respectively.


6. THE RELATIONSHIP OF THE P(k,n) TO THE STIRLING
NUMBERS OF THE FIRST KIND, $S_n^k$


The $S_n^k$ are defined by

$$\sum_{k=1}^{n} S_n^k t^k = t(t-1)\ldots(t-n+1),$$

from which it follows, by substituting −t for t, that

$$\sum_{k=1}^{n} |S_n^k|\, t^k = t(t+1)\ldots(t+n-1).$$

But, by (30),

$$\sum_{k=1}^{n} P(k,n)\, t^k = \frac{t(t+1)\ldots(t+n-1)}{n!}.$$

Hence

$$P(k,n) = \frac{1}{n!}\,|S_n^k|.$$

Since Stirling Numbers are treated rather exhaustively in Jordan (1947, Ch. IV) we shall
content ourselves with noting a few of their more important properties in connexion with
the P(k,n), giving references where necessary.
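
The identification with the Stirling numbers is easy to reproduce by expanding the rising product directly. In this sketch the helper names `unsigned_stirling_row` and `p_from_stirling` are hypothetical; the first returns the coefficients of t(t+1)...(t+n−1), the second divides by n!.

```python
from fractions import Fraction
from math import factorial


def unsigned_stirling_row(n):
    """Coefficients c[k] of t^k in t(t+1)...(t+n-1); c[k] is |S_n^k|."""
    coeffs = [1]                       # the constant polynomial 1
    for m in range(n):                 # multiply by (t + m)
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += m * c
            new[k + 1] += c
        coeffs = new
    return coeffs


def p_from_stirling(k, n):
    """P(k, n) = |S_n^k| / n!."""
    return Fraction(unsigned_stirling_row(n)[k], factorial(n))


print(unsigned_stirling_row(6)[1:])   # the column n = 6 of the table of n! P(k, n)
print(p_from_stirling(1, 5), p_from_stirling(5, 5))
```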


7. PROPERTIES OF THE P(k,n) 


The general solution of equations (10)-(13) is unknown (Jordan, p. 143), but in some cases
they can be solved easily:

$$P(1,n) = \frac{1}{n},$$

$$P(n,n) = \frac{1}{n!}$$

(as is otherwise obvious, since there is only one permutation σ such that $X_{\sigma_1} > \ldots > X_{\sigma_n}$),

$$P(n-1,n) = \frac{1}{2\,(n-2)!},$$

$$P(n-2,n) = \frac{1}{8\,(n-4)!} + \frac{1}{3\,(n-3)!},$$

and so on (Jordan, p. 149).
We give a table of n! P(k,n) for $1 \le k \le n \le 12$ (Jordan, p. 144).
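The mean μ and the variance σ² of the number of blocks yielded

$$\mu = \sum_{k=1}^{n} k P(k,n) = G_n'(1).$$

But $\log G_n(t) = \log t + \ldots + \log(t+n-1) - \log(n!)$. Thus

$$\frac{G_n'(t)}{G_n(t)} = \frac{1}{t} + \frac{1}{t+1} + \ldots + \frac{1}{t+n-1}. \tag{33}$$

Using (28) and (33)

$$\mu = 1 + \frac{1}{2} + \ldots + \frac{1}{n}. \tag{34}$$

The ‘mean length of each block’, $\bar{l} = \sum_{k=1}^{n} (n/k)\,k P(k,n) \Big/ \sum_{k=1}^{n} k P(k,n) = n/\mu$. From (33)

$$\frac{G_n(t)G_n''(t) - \{G_n'(t)\}^2}{\{G_n(t)\}^2} = -\frac{1}{t^2} - \ldots - \frac{1}{(t+n-1)^2}.$$

Hence

$$\sigma^2 = G_n''(1) + G_n'(1) - [G_n'(1)]^2 = \left(1 + \frac{1}{2} + \ldots + \frac{1}{n}\right) - \left(1 + \frac{1}{2^2} + \ldots + \frac{1}{n^2}\right). \tag{35}$$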
Table of n! P(k,n) for 1 ≤ k ≤ n ≤ 12

k \ n  1  2  3   4   5    6      7       8        9         10          11           12
    1  1  1  2   6  24  120    720   5,040   40,320    362,880   3,628,800   39,916,800
    2  .  1  3  11  50  274  1,764  13,068  109,584  1,026,576  10,628,640  120,543,840
    3  .  .  1   6  35  225  1,624  13,132  118,124  1,172,700  12,753,576  150,917,976
    4  .  .  .   1  10   85    735   6,769   67,284    723,680   8,409,500  105,258,076
    5  .  .  .   .   1   15    175   1,960   22,449    269,325   3,416,930   45,995,730
    6  .  .  .   .   .    1     21     322    4,536     63,273     902,055   13,339,535
    7  .  .  .   .   .    .      1      28      546      9,450     157,773    2,637,558
    8  .  .  .   .   .    .      .       1       36        870      18,150      357,423
    9  .  .  .   .   .    .      .       .        1         45       1,320       32,670
   10  .  .  .   .   .    .      .       .        .          1          55        1,925
   11  .  .  .   .   .    .      .       .        .          .           1           66
   12  .  .  .   .   .    .      .       .        .          .           .            1














Asymptotic properties, as n → ∞

From equation (5) of Jordan, p. 160, we obtain, for fixed k,

$$P(k,n) \sim \frac{(\log n)^{k-1}}{n\,(k-1)!}.$$

Thus $\lim_{n \to \infty} P(k,n) = 0$, uniformly in k. Let

$$P(k_{\max}, n) = \max_{k=1,\ldots,n} P(k,n).$$

Then, for large n, we have approximately (Jordan, p. 161)

$$k_{\max} > \log n > k_{\max} - 1.$$

By (34) $\mu \sim \log n$. Hence $\bar{l} \sim n/\log n$.

Thus both μ and $\bar{l}$ → ∞; however,

$$\frac{\bar{l}}{\mu} \sim \frac{n}{[\log n]^2}.$$

By (35), σ² ∼ log n, also.

As we might expect, a central limit theorem applies, viz. prob. (no. of blocks yielded by
a R.A. of $[n]$ lies between $\log n + a\sqrt{\log n}$ and $\log n + b\sqrt{\log n}$, a < b) ∼ Φ(b) − Φ(a). For a
justification of this, see Feller (1957, pp. 242-3), where problem (a) of § 8 is discussed.
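
As a numerical check on (34) and (35), the distribution can be built from the recurrence (10)-(13) and its mean and variance compared with the harmonic sums. The function name `block_count_distribution` below is hypothetical, and exact fractions are used so that the comparison is an identity rather than an approximation.

```python
from fractions import Fraction


def block_count_distribution(n):
    """{k: P(k, n)} built up from P(1, 1) = 1 by the recurrence (11)-(13)."""
    dist = {1: Fraction(1)}
    for m in range(2, n + 1):
        dist = {k: (dist.get(k - 1, 0) + (m - 1) * dist.get(k, 0)) / m
                for k in range(1, m + 1)}
    return dist


n = 12
dist = block_count_distribution(n)
mean = sum(k * p for k, p in dist.items())
variance = sum(k * k * p for k, p in dist.items()) - mean ** 2

harmonic = sum(Fraction(1, j) for j in range(1, n + 1))
harmonic_sq = sum(Fraction(1, j * j) for j in range(1, n + 1))

print(float(mean), float(harmonic))                    # both sides of (34)
print(float(variance), float(harmonic - harmonic_sq))  # both sides of (35)
```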
8. TWO RELATED PROBLEMS CONCERNING RANDOM PERMUTATIONS

(a) The following problem is treated in Riordan (1958, pp. 66-72) and in Feller (1957,
pp. 242-3).

‘What is the probability P′(k,n) that a random permutation $\sigma = \begin{pmatrix} 1 & \ldots & n \\ \sigma_1 & \ldots & \sigma_n \end{pmatrix}$ yields k cycles?’

Surprisingly, it turns out that we may derive the equations (10)-(13) for P′(k,n) in a
similar, but more elementary, way and so P′(k,n) = P(k,n).

(b) The second analogous problem may be stated thus:

‘Let $\sigma(\sigma_1 \ldots \sigma_n)$ be a random permutation of the integers 1, ..., n. Take as first
“block” $(\sigma_1 \ldots \sigma_{j_1})$, where $\sigma_{j_1} = \max_{j=1,\ldots,n} \sigma_j\ (= n)$; $(\sigma_{j_1+1} \ldots \sigma_{j_2})$ as second “block”, where
$\sigma_{j_2} = \max_{j=(j_1+1),\ldots,n} \sigma_j$; and so on. What is the probability P″(k,n) that this amalgamation
process yields k blocks in all?’

By a simple argument (in which the position of n is considered) we may obtain the
recurrence relations

$$P''(1,n) = \frac{1}{n}$$

and

$$P''(k,n) = \frac{1}{n} \sum_{s=k-1}^{n-1} P''(k-1,s) \quad (2 \le k \le n).$$

It is easily verified that H″(z,t) = H(z,t), where

$$H''(z,t) = \sum_{n=1}^{\infty} \sum_{k=1}^{n} P''(k,n)\, t^k z^n,$$

and so P″(k,n) = P(k,n).

This result may alternatively be derived as a corollary of the above theory, as follows.
Select an n-collection $[n]_{\mathcal{D}}$ of ordinates $X_i$, with $X_n > \ldots > X_1$, and weights 1, ..., 1 which
satisfy the additional condition:

$$\mathcal{D}:\quad \frac{X_{p_1} + \ldots + X_{p_r}}{r} > \text{ or } < \frac{X_{p_{r+1}} + \ldots + X_{p_{r+s}}}{s} \quad \text{according as} \quad \max_{i=1,\ldots,r} p_i > \text{ or } < \max_{i=(r+1),\ldots,(r+s)} p_i,$$

for any (r+s)-permutation $p(p_1 \ldots p_{r+s})$ of the integers 1, ..., n, where $1 \le r, s \le n-1$ and
$2 \le r+s \le n$.

We may use a step-by-step construction to verify that such sets of ordinates exist. With
the n weights 1, ..., 1 select arbitrarily the ordinates $X_1$ and $X_2$, with $X_2 > X_1$; $X_1$ and $X_2$
satisfy $\mathcal{C}$ (with weights 1, 1) and $\mathcal{D}$. Now choose $X_3 > X_2$, and sufficiently large that
$X_1$, $X_2$ and $X_3$ satisfy $\mathcal{C}$ (with weights 1, 1, 1) and $\mathcal{D}$. Likewise choose successively
$X_4, \ldots, X_n$.

Now consider more closely the complete amal. by $\mathcal{A}_2$ of the R.A. $\begin{pmatrix} X_{\sigma_1} & \ldots & X_{\sigma_n} \\ 1 & \ldots & 1 \end{pmatrix}$ of $[n]_{\mathcal{D}}$.
The first block will be $Y_1^{l_1}$, where $X_1^{l_1} = \max_{l=1,\ldots,n} X_1^l$. But, by $\mathcal{D}$, $X_{\sigma_{l_1}} = \max_{j=1,\ldots,n} X_{\sigma_j}\ (= X_n)$.
Similarly, the second block will be $Y_{l_1+1}^{l_2}$, where $X_{\sigma_{l_2}} = \max_{j=(l_1+1),\ldots,n} X_{\sigma_j}$; and so on.

It is now seen that, for the same random permutation σ, the amalgamation process in
the two problems is identical, under the correspondence $\sigma_j \leftrightarrow X_{\sigma_j}$. Hence P″(k,n) = P(k,n).

The problems (a) and (b) lie primarily in the realm of combinatorial analysis rather than
probability theory (as perhaps also does the rest of the work).
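
Both related problems can be checked by direct enumeration for small n. The sketch below is illustrative (the names `cycle_count`, `max_block_count` and `census` are hypothetical, and permutations are written on the integers 0, ..., n−1 rather than 1, ..., n): it counts, over all n! permutations, how many have k cycles and how many split into k blocks under the maximum rule of problem (b).

```python
from collections import Counter
from itertools import permutations


def cycle_count(perm):
    """Number of cycles of the permutation i -> perm[i]."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def max_block_count(perm):
    """Number of blocks when each block ends at the largest remaining entry."""
    blocks, start = 0, 0
    while start < len(perm):
        end = max(range(start, len(perm)), key=lambda j: perm[j])
        blocks += 1
        start = end + 1
    return blocks


def census(n, statistic):
    """counts[k-1] = number of permutations of size n with statistic equal to k."""
    counts = Counter(statistic(p) for p in permutations(range(n)))
    return [counts[k] for k in range(1, n + 1)]


print(census(5, cycle_count))
print(census(5, max_block_count))
```

Both lines should reproduce the column n = 5 of the table of n! P(k,n).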


I am indebted to both Dr D. J. Bartholomew, for originating the problem, and Mr D. V.
Lindley, for bringing it to my notice and giving helpful advice.

APPENDIX

We are now in a position to prove the truth of Bartholomew’s conjecture (*)-(**) for $\hat{P}(l,k)$ referred to
in § 1. First, applying the Corollary to Theorem 2 with the probability distribution taken to be normal, it
is seen that $\hat{P}(l,k) = P(l,k)$, where $\hat{P}(l,k)$ is as defined in § 1. Thus

$$\hat{P}(l,k) = P(l,k) = \frac{1}{k!}\,|S_k^l|.$$

To complete the proof, we prove that the conjectured recurrence relations

$$\hat{P}(l,k) = \frac{1}{l} \sum_{j=l-1}^{k-1} \hat{P}(1,k-j)\,\hat{P}(l-1,j) \quad (2 \le l \le k), \tag{*}$$

where

$$\hat{P}(1,k) = \frac{1}{k}, \tag{**}$$

yield the same explicit formula for $\hat{P}(l,k)$ as above, thereby ensuring their own validity. Using
(**), substitute for $\hat{P}(1,k-j)$ in (*). Multiply the resulting equation by $z^k$ and sum from k = 1 to ∞.
Writing

$$\hat{F}_l(z) = \sum_{k=l}^{\infty} \hat{P}(l,k)\, z^k,$$

this gives

$$\hat{F}_l(z) = \frac{1}{l}\left\{ z\hat{F}_{l-1}(z) + \tfrac{1}{2} z^2 \hat{F}_{l-1}(z) + \ldots + \tfrac{1}{j} z^j \hat{F}_{l-1}(z) + \ldots \right\} = \frac{1}{l} \log\left(\frac{1}{1-z}\right) \hat{F}_{l-1}(z).$$

Also, by (**),

$$\hat{F}_1(z) = z + \tfrac{1}{2} z^2 + \ldots + \tfrac{1}{k} z^k + \ldots = \log\left(\frac{1}{1-z}\right).$$

From the last two equations, we have

$$\hat{F}_l(z) = \frac{[\log\{1/(1-z)\}]^l}{l!} = F_l(z),$$

by (32). Hence $\hat{P}(l,k) = P(l,k)$ for all l, k.

Thus, while not as simple as (10)-(13), (*)-(**) are indeed exactly equivalent to them. It does not seem
possible to prove (*) directly despite the heuristic interpretation it may be given. However, (*) is but
one of a number of recurrence relations satisfied by the P(l,k). We have one in § 8(b) for P″(l,k) = P(l,k),
and yet more may be found in Jordan (1947, Ch. IV), by using the relation $|S_k^l| = k!\,P(l,k)$.
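
The equivalence of (*)-(**) with (10)-(13) can also be confirmed term by term for small indices. This is a sketch only, with hypothetical function names: `p_hat` evaluates the conjectured recurrence exactly and `p_rec` evaluates (10)-(13), both in rational arithmetic.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def p_hat(l, k):
    """Solution of the conjectured relations (*) and (**), for 1 <= l <= k."""
    if l == 1:
        return Fraction(1, k)                      # relation (**)
    total = sum(Fraction(1, k - j) * p_hat(l - 1, j) for j in range(l - 1, k))
    return total / l                               # relation (*)


@lru_cache(maxsize=None)
def p_rec(k, n):
    """P(k, n) from relations (10)-(13); zero outside 1 <= k <= n."""
    if k < 1 or k > n:
        return Fraction(0)
    if n == 1:
        return Fraction(1)
    return (p_rec(k - 1, n - 1) + (n - 1) * p_rec(k, n - 1)) / n


agree = all(p_hat(l, k) == p_rec(l, k) for k in range(1, 9) for l in range(1, k + 1))
print(agree)
```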


Note added by author in proof. I am grateful to Dr H. D. Brunk of the University of Missouri for
the information that this type of problem has been considered before by E. S. Andersen (Math.
Scand. (1954), 2, §§ 7, 8, pp. 209-18). He derives ((8.31), p. 216) my recurrence relations (10)-(13),
using an entirely different, but less direct, method. Apart from this, there is little or no overlap
between Andersen’s paper and mine.

A TEST OF HOMOGENEITY FOR ORDERED ALTERNATIVES. II


By D. J. BARTHOLOMEW 
University College of North Staffordshire 


1. INTRODUCTION AND SUMMARY 


In an earlier paper (Bartholomew, 1959) referred to subsequently as I, modified χ²- and
F-tests were introduced for use against ordered alternatives. The need for them arises in
the following way. Suppose that we have a sample of k independent values $x_1, x_2, \ldots, x_k$,
where $x_i$ is normally distributed with mean $m_i$ and standard deviation $\sigma_i$ (i = 1, 2, ..., k).
A test is required of the hypothesis $m_1 = m_2 = \ldots = m_k$ ($H_0$) against the ordered alternative
$m_1 \ge m_2 \ge \ldots \ge m_k$ ($H_1$). The modified tests, denoted by $\bar{\chi}^2$ and $\bar{F}$, are calculated after the
original values have been ‘reduced’ by an averaging process which depends upon their
order and magnitude.

The distribution of these statistics was completely determined for k = 3 and k = 4 and
tables of percentage points were given for $\bar{\chi}^2$. The distribution depends upon certain
probabilities, P(l,k). In the general case, when the σ’s are not equal, their determination for
k > 4 was not possible because of lack of knowledge about the normal multivariate integral.
For the special case of equal standard deviations, however, a recurrence relation was
conjectured. Miles (1959) has proved this conjecture and has further shown that
$P(l,k) = |S_k^l|/k!$, where $S_k^l$ is the Stirling number of the first kind. This result has made it
desirable to extend the tables of percentage points and to give some further properties of
the distribution. This work occupies § 2 of the present paper.

It was pointed out in I that a test was required for use when the direction of the ordering
was unknown; that is, for the alternative

$$H_2:\ m_1 \ge m_2 \ge \ldots \ge m_k, \quad \text{or} \quad m_k \ge m_{k-1} \ge \ldots \ge m_1.$$

Such a test is provided in § 3. It is a two-sided version of $\bar{\chi}^2$ (or $\bar{F}$) and requires no new
distribution theory. Some extra significance levels have been given for k = 3 and k = 4
to facilitate the use of this version of the test.

A good approximation to the distribution in the general case, for k = 5, has been obtained;
this is given in § 4.


2. THE DISTRIBUTION OF $\bar{\chi}^2$ FOR EQUAL WEIGHTS


2.1. The distribution of $\bar{\chi}^2$ when $\sigma_1 = \sigma_2 = \ldots = \sigma_k$ is given by

$$\Pr\{\bar{\chi}^2 \ge \gamma\} = \frac{1}{k!} \sum_{l=2}^{k} |S_k^l|\, \Pr\{\chi^2_{l-1} \ge \gamma\}, \qquad \Pr\{\bar{\chi}^2 = 0\} = 1/k. \tag{1}$$

A table of Stirling numbers will be found in Miles’s paper. The 10, 5, 2.5, 1 and 0.5%
points of $\bar{\chi}^2$ calculated from (1) are given in Table 1.

                     100P
  k      10        5      2.5        1      0.5
  3    2.580    3.820    5.098    6.822    8.146
  4    3.187    4.528    5.891    7.709    9.092
  5    3.636    5.049    6.471    8.356    9.784
  6    3.994    5.460    6.928    8.865   10.327
  7    4.289    5.800    7.304    9.284   10.774
  8    4.542    6.088    7.624    9.639   11.153
  9    4.761    6.339    7.901    9.946   11.480
 10    4.956    6.560    8.145   10.216   11.767
 11    5.130    6.758    8.363   10.458   12.025
 12    5.288    6.937    8.561   10.676   12.257
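
Equation (1) is straightforward to evaluate, which gives a way of checking entries of Table 1. The sketch below uses only the Python standard library; the function names are hypothetical, and the chi-square tail probability for integer degrees of freedom is computed from the standard finite-sum closed forms rather than from a statistics package.

```python
from math import erfc, exp, factorial, pi, sqrt


def unsigned_stirling_row(k):
    """Coefficients of t^l in t(t+1)...(t+k-1), i.e. |S_k^l| for l = 0..k."""
    coeffs = [1]
    for m in range(k):
        new = [0] * (len(coeffs) + 1)
        for l, c in enumerate(coeffs):
            new[l] += m * c
            new[l + 1] += c
        coeffs = new
    return coeffs


def chi2_sf(x, df):
    """Pr{chi-square on df degrees of freedom >= x}, for integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return exp(-x / 2) * total
    total, term = 0.0, sqrt(x)       # term = x^(r - 1/2) / (1.3...(2r - 1)), r = 1
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 / pi) * exp(-x / 2) * total


def chibar_sf(gamma, k):
    """Right-hand side of (1): the upper tail probability for equal weights."""
    row = unsigned_stirling_row(k)
    return sum(row[l] * chi2_sf(gamma, l - 1) for l in range(2, k + 1)) / factorial(k)


print(round(chibar_sf(3.820, 3), 4), round(chibar_sf(4.528, 4), 4))
```

Evaluated at the tabulated 5% points for k = 3 and k = 4, the tail probability should come out very close to 0.05.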





It is of interest to point out that the distribution (1) is also applicable for certain sets of
unequal σ’s. It can be shown that the general distribution depends on the (k − 2) parameters

$$-\rho_{i,i+1} = \left[\frac{a_i\, a_{i+2}}{(a_i + a_{i+1})(a_{i+1} + a_{i+2})}\right]^{1/2} \quad (i = 1, 2, \ldots, k-2),$$

where $a_i = 1/\sigma_i^2$. These parameters are all equal to one-half when $a_1 = a_2 = \ldots = a_k$ but
it is also possible to find other sets of a’s for which this is so. The reason for this is that, for
any set of ρ’s, there are (k − 2) equations to determine (k − 1) ratios, $a_i/a_{i+1}$ (i = 1, 2, ..., k−1).
The distribution would therefore be described more correctly as that for equal ρ’s. However,
because of the much greater practical importance of the case $a_1 = a_2 = \ldots = a_k$ the description
used in I has been retained.


2.2. Properties of the distribution of $\bar{\chi}^2$ for equal weights. The moments and limiting
properties of the distribution are most conveniently approached via the characteristic function

$$\phi_{\bar{\chi}^2}(t) = \sum_{l=1}^{k} P(l,k)\,(1 - 2it)^{-\frac{1}{2}(l-1)} = \frac{(z+1)(z+2)\ldots(z+k-1)}{k!},$$

where $z = (1 - 2it)^{-\frac{1}{2}}$ (see Miles (1959) equation (30)). Hence, for the cumulant generating
function we have

$$\psi(t) = \log_e \phi(t) = \sum_{j=1}^{k-1} \log_e (j + z) - \log_e k!.$$

The first four cumulants are found to be

$$\kappa_1 = \sum_{j=2}^{k} j^{-1}, \qquad \kappa_2 = \sum_{j=2}^{k} \{3j^{-1} - j^{-2}\},$$

$$\kappa_3 = \sum_{j=2}^{k} \{15j^{-1} - 9j^{-2} + 2j^{-3}\}, \tag{2}$$

$$\kappa_4 = \sum_{j=2}^{k} \{105j^{-1} - 87j^{-2} + 36j^{-3} - 6j^{-4}\}.$$
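
The first two cumulants in (2) can be cross-checked against the mixture form of the distribution. The sketch below assumes only that, given l, the statistic is distributed as χ² on l − 1 degrees of freedom (with mean l − 1 and second moment (l − 1)² + 2(l − 1)), and that P(l,k) satisfies the recurrence proved by Miles; the function name `p_row` is hypothetical.

```python
from fractions import Fraction


def p_row(k):
    """{l: P(l, k)} from the recurrence P(l,k) = [P(l-1,k-1) + (k-1) P(l,k-1)] / k."""
    row = {1: Fraction(1)}
    for m in range(2, k + 1):
        row = {l: (row.get(l - 1, 0) + (m - 1) * row.get(l, 0)) / m
               for l in range(1, m + 1)}
    return row


k = 10
row = p_row(k)

# Moments of the mixture of chi-square distributions on l - 1 degrees of freedom.
mean = sum(p * (l - 1) for l, p in row.items())
second = sum(p * ((l - 1) ** 2 + 2 * (l - 1)) for l, p in row.items())
variance = second - mean ** 2

kappa1 = sum(Fraction(1, j) for j in range(2, k + 1))
kappa2 = sum(Fraction(3, j) - Fraction(1, j * j) for j in range(2, k + 1))

print(float(mean), float(kappa1))
print(float(variance), float(kappa2))
```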


Some numerical values of $\beta_1$ and $\beta_2$ are given in Table 2, from which it will be seen that the
distribution is extremely skew even when k is quite large. In the event of having to test
significance for large k a curve could be fitted by moments. A χ²-curve, fitted using the first
two moments, gives a very good approximation for k = 10. As k → ∞

$$\phi(t) \sim \frac{k^{z-1}}{\Gamma(z+1)},$$

using the Euler-Gauss product limit formula for the Γ-function. The cumulants of the
limiting form are thus given by

$$\kappa_r = 1 \cdot 3 \cdot 5 \ldots (2r-1)\, \log k.$$

The distribution ultimately approaches normality, but only very slowly, as shown in
Table 2.


Table 2. Exact and limiting values of $\beta_1$ and $\beta_2$ for $\bar{\chi}^2$ (equal weights)

  k                  10     20     50    100   10^3   10^6  10^12
  β1   Exact       4.14   3.10   2.30   1.95      -      -      -
       Limiting    3.62   2.78   2.13   1.81   1.21   0.60   0.30
  β2   Exact       8.88   7.39   6.28   5.75      -      -      -
       Limiting    8.07   6.89   5.98   5.53   4.69   3.84   3.42




2.3. The distribution of $\bar{F}$, when the weights are equal, is given by


a 1k © 
Pr{P > y} = 7) S184 | “pias ar | 
k i 
=> (Sh) 4AM -), 4-0], ®) 
kl 129 
Pr{F = 0} = 1/k, 


where z = (l—1)y/[N-—1+(l-1)y]. 


Tables of percentage points have not been calculated, nor has any simple approximation 
been found. However, if tables of the incomplete B-function are available, significance can 
be tested as described in I. 


3. A TWO-SIDED TEST

3.1. In some problems it is possible to rank the means under the alternative without
knowing whether the sequence is increasing or decreasing. This hypothesis may be written

$$H_2:\ m_1 \ge m_2 \ge \ldots \ge m_k, \quad \text{or} \quad m_k \ge m_{k-1} \ge \ldots \ge m_1.$$

A test designed to detect this kind of departure will be called two-sided. The experiment to
compare two ulcer treatments (A and B), discussed in I, would have required this kind of
test had it not been assumed that treatment A would be better than B under the alternative.
A further example, to illustrate the method, is given below. The following test procedure
is suggested. First, calculate $\bar{\chi}^2$ (or $\bar{F}$) as if the alternative were $m_1 \ge m_2 \ge \ldots \ge m_k$, and
then as if it were $m_k \ge m_{k-1} \ge \ldots \ge m_1$. If the two values of $\bar{\chi}^2$ so obtained are denoted by
$\bar{\chi}'^2$ and $\bar{\chi}''^2$, the test criterion is defined as $\max(\bar{\chi}'^2, \bar{\chi}''^2)$. Since $\bar{\chi}^2$ can be derived by the
likelihood ratio technique it follows that the two-sided version can also be obtained in
this way.


Example 


The table given below has been constructed from data given by Maguire, Pearson &
Wynn (1952, 1953) on explosions in coal mines involving more than ten men killed. It
illustrates the application of the two-sided test and draws attention to a further field of
application. The null hypothesis to be tested is that the explosions occur randomly at a
constant rate. The alternative is that the rate of occurrence has either increased or decreased
during the period. If the alternative stated only that the rate differed from one period to
another the Poisson index of dispersion would be used to test the null hypothesis. The
application of $\bar{\chi}^2$ to the frequencies is equivalent to calculating this index for the reduced
problem. It has to be assumed that the Poisson frequencies are approximately normally
distributed; this assumption is implied in the use of the χ²-distribution to test the significance
of the index of dispersion and will be considered adequate here. We therefore have to test
the homogeneity of a set of normal variables with weights equal to the reciprocals of the mean
frequency. After carrying out the averaging process indicated on the right of the table we
arrive at the following two reduced forms.


Table 3. Application of the two-sided $\bar{\chi}^2$-test to data on the number of explosions
in coal mines involving more than ten men killed (1876-1945)

















{ 
: No. of . : 
Period : Decreasing trend Increasing trend 
explosions 
| 1876-85 31 
1886-95 22 20 | | 
1896-05 7 18-5 
1906-15 ae asied 132 | 
1916-25 5 | } 11-0 
1926-35 1 i 1L3 \ | 
| | 14-5 
1936-45 14 | J | 
| 
Mean frequency = 15.43.
(a) Decreasing trend.  Average:           31     22     11
                       Weight × 15.43:     1      1      5
(b) Increasing trend.  Average:         13.2   14.5
                       Weight × 15.43:     5      2

It is clear that $\bar{\chi}^2$ calculated from (a) will be much larger than from (b) so we have

$$\max(\bar{\chi}'^2, \bar{\chi}''^2) = \{31^2 + 22^2 + 5(11)^2 - 7(15.43)^2\}/15.43 = 24.85.$$









It will be clear from the following section, and Table 1, that this value is highly significant 
and therefore that the data provide strong evidence for a decreasing trend. This conclusion 
is in agreement with those reached using other tests (Maguire et al. (1953), Barnard (1953)). 

3.2. The distribution of the two-sided test. Let $E_1$ denote the event $\bar\chi_1^2 > y$ and $E_2$ the event $\bar\chi_2^2 > y$. The probability integral of the two-sided test is given by
$$P(E_1 + E_2) = P(E_1) + P(E_2) - P(E_1E_2).$$
By symmetry, $P(E_1) = P(E_2)$ so that if $y$ is the $100\alpha\%$ point of $\bar\chi^2$,
$$P(E_1 + E_2) = 2\alpha - P(E_1E_2).$$
It can be deduced immediately from this that
$$\alpha \le P(E_1 + E_2) \le 2\alpha.$$

In the case $k = 3$ it is possible to determine $P(E_1E_2)$ exactly, using tables of the normal bivariate integral; these calculations show that it can be neglected by comparison with $2\alpha$. The following heuristic argument indicates that this will be true for all $k$.

If $y$ is large (so that $\alpha$ is small) the events $E_1$ and $E_2$ indicate significant departures from the null hypothesis in opposite directions. Consequently, the fact that one has occurred reduces the chance that the other will also have occurred, that is $P(E_2 \mid E_1) < \alpha$. Thus we have
$$P(E_1E_2) = P(E_1)\,P(E_2 \mid E_1) < \alpha^2.$$
Since $\alpha$ is small, we take
$$\Pr\{\max(\bar\chi_1^2, \bar\chi_2^2) > y\} = 2\Pr\{\bar\chi^2 > y\}$$
when testing significance, the maximum possible error being $\alpha$. The existing tables can therefore be used if the significance level is doubled. The value obtained for the example in § 3.1 for k = 7 lies well beyond the 1% point for the two-sided test.

3.3. In order to provide 5 and 1% points for the two-sided test, 2.5 and 0.5% levels of $\bar\chi^2$ have been computed. They are given in Tables 4 and 5 for k = 3 and k = 4. For convenience, Table 4 also includes the 5 and 1% levels already given in I and an additional column of 10% points.


Table 4. Percentage points of $\bar\chi_3^2$

Table 5. The 2.5% (upper figure) and 0.5% (lower figure) points of $\bar\chi_4^2$

          0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7

 0.0    6.862      -       -       -       -       -       -       -
         [?]       -       -       -       -       -       -       -

 0.1    6.795   6.724      -       -       -       -       -       -
         [?]     [?]       -       -       -       -       -       -

 0.2    6.725   6.649   6.570      -       -       -       -       -
       10.022   9.939    [?]       -       -       -       -       -

 0.3    6.653   6.571   6.484   6.391      -       -       -       -
        9.942   9.852   9.756   9.653      -       -       -       -

 0.4    6.575   6.486   6.392   6.289   6.174      -       -       -
        9.855   9.758   9.653   9.540   9.411      -       -       -

 0.5    6.491   6.394   6.289   6.174   6.043   5.891      -       -
        9.761   9.654   9.538   9.410   9.264   9.092      -       -

 0.6    6.398   6.289   6.171   6.038   5.886   5.702   5.462      -
        9.656   9.536   9.405   9.257   9.086   8.877   8.604      -

 0.7    6.292   6.166   6.028   5.870   5.682   5.443   5.100   4.346
        9.534   9.396   9.242   9.065   8.853   8.581   8.183   7.278

 0.8    6.163   6.011   5.838   5.634   5.375   4.999   3.841      -
        9.385   9.217   9.025   8.795   8.499   8.064    [?]       -

 0.9    5.991   5.778   5.523   5.183   4.591      -       -       -
        9.183   8.948   8.661   8.273   7.577      -       -       -

 1.0    5.537      -       -       -       -       -       -       -
         [?]       -       -       -       -       -       -       -


4. THE DISTRIBUTION FOR k = 5 


4.1. Unless all the correlations are equal to minus one-half it is not possible to test significance exactly for large k. Existing results suggest that the distribution for equal weights may be a good approximation unless there is a large divergence of the $\rho$'s from minus one-half. Alternatively, the number of groups can be reduced by pooling as suggested in I.
Fortunately, a great many practical problems fall within the scope of the existing theory. 
However, gradings of the kind—very good, good, fair, poor and bad—or—extreme, 
moderate and uncommitted—occasionally run to as many as five or six categories. It is 
therefore desirable to obtain some results for these cases. In what follows we give a good 
approximation for the case k = 5 and suggest that the same method might yield useful 
results for k = 6. The only difficulty in the determination of the exact distribution lies in 
finding P(5,5). Plackett (1954) gives a method of obtaining exact values which involves 
numerical integration. However, this arithmetic can be avoided by using an approximate 
formula given by the same author. P(5,5) is the volume in the positive quadrant of the 
four-variate normal distribution with correlation matrix 


$$R = \begin{pmatrix} 1 & \rho_{12} & 0 & 0 \\ \rho_{12} & 1 & \rho_{23} & 0 \\ 0 & \rho_{23} & 1 & \rho_{34} \\ 0 & 0 & \rho_{34} & 1 \end{pmatrix}.$$

Plackett gives the approximate formula
$$P(5,5) = \frac{1}{4\pi^2}\left[\cos^{-1}(-\rho_{12})\cos^{-1}(-\rho_{34}) + \tfrac12\pi\cos^{-1}(-\rho_{23})\right] - \tfrac{1}{16},$$
valid for small $\rho$. It is shown below that this approximation is very good for the present purpose when $-\rho_{12} = -\rho_{23} = -\rho_{34} = \tfrac12$, and cannot lead to a large error for larger $\rho$'s.


4.2. Using Plackett's approximate formula for P(5, 5) and the same method as in I, we obtain the following probabilities. They are expressed in terms of $\sin^{-1}$ since this function has been extensively tabled in the Table of Arcsin x (1945). The formulae have been simplified by using the notation of partial correlation.

$$P(5,5) = \{\sin^{-1}(-\rho_{12})\sin^{-1}(-\rho_{34})\}/(4\pi^2) - \{\sin^{-1}(-\rho_{12}) + \sin^{-1}(-\rho_{23}) + \sin^{-1}(-\rho_{34})\}/(8\pi) + \tfrac{1}{16},$$

$$P(4,5) = \tfrac14 - \{\sin^{-1}(-\rho_{12}) + \sin^{-1}(-\rho_{34})\}/(8\pi) - \{\sin^{-1}(-\rho_{12.3}) + \sin^{-1}(-\rho_{23.4}) + \sin^{-1}(-\rho_{34.2}) + \sin^{-1}(-\rho_{13.2}) + \sin^{-1}(-\rho_{23.1}) + \sin^{-1}(-\rho_{24.3})\}/(8\pi),$$

$$P(3,5) = \tfrac38 - \{\sin^{-1}(-\rho_{12.34}) + \sin^{-1}(-\rho_{13.24}) + \sin^{-1}(-\rho_{14.23}) + \sin^{-1}(-\rho_{23.14}) + \sin^{-1}(-\rho_{24.13}) + \sin^{-1}(-\rho_{34.12})\}/(8\pi)$$
$$\qquad + \{\sin^{-1}(-\rho_{12}) + \sin^{-1}(-\rho_{23}) + \sin^{-1}(-\rho_{34})\}/(8\pi)$$
$$\qquad - \{\sin^{-1}(-\rho_{12})\sin^{-1}(-\rho_{34.12}) + \sin^{-1}(-\rho_{23})\sin^{-1}(-\rho_{14.23}) + \sin^{-1}(-\rho_{34})\sin^{-1}(-\rho_{12.34})\}/(4\pi^2),$$

$$P(2,5) = \tfrac12 - P(4,5),$$

$$P(1,5) = 1 - \sum_{j=2}^{5} P(j,5).$$
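The formulae of § 4.2 can be evaluated numerically as in the following sketch. It is an illustration under stated assumptions, not the author's computing scheme: the names `partial_corr` and `level_probabilities_k5` are hypothetical, the constants 1/4 and 3/8 and the partial-correlation subscripts are as read from the (partly illegible) printed formulae, and partial correlations are obtained from the standard recursion on the tridiagonal correlation matrix of § 4.1.

```python
from math import asin, pi, sqrt


def partial_corr(R, i, j, given):
    """Partial correlation of variables i and j (1-based) given the list `given`."""
    if not given:
        return R[i - 1][j - 1]
    *rest, last = given
    r_ij = partial_corr(R, i, j, rest)
    r_il = partial_corr(R, i, last, rest)
    r_jl = partial_corr(R, j, last, rest)
    return (r_ij - r_il * r_jl) / sqrt((1 - r_il ** 2) * (1 - r_jl ** 2))


def level_probabilities_k5(r12, r23, r34):
    """Approximate P(l, 5), l = 1..5, using Plackett's approximation for P(5, 5)."""
    R = [[1, r12, 0, 0], [r12, 1, r23, 0], [0, r23, 1, r34], [0, 0, r34, 1]]
    s = lambda rho: asin(-rho)  # every term enters as arcsin(-rho)
    a12, a23, a34 = s(r12), s(r23), s(r34)

    p5 = a12 * a34 / (4 * pi ** 2) - (a12 + a23 + a34) / (8 * pi) + 1 / 16

    first_order = [(1, 2, 3), (2, 3, 4), (3, 4, 2), (1, 3, 2), (2, 3, 1), (2, 4, 3)]
    p4 = (0.25 - (a12 + a34) / (8 * pi)
          - sum(s(partial_corr(R, i, j, [l])) for i, j, l in first_order) / (8 * pi))

    pairs = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
    second = {
        (i, j): s(partial_corr(R, i, j, [m for m in (1, 2, 3, 4) if m not in (i, j)]))
        for i, j in pairs
    }
    p3 = (3 / 8 - sum(second.values()) / (8 * pi) + (a12 + a23 + a34) / (8 * pi)
          - (a12 * second[(3, 4)] + a23 * second[(1, 4)] + a34 * second[(1, 2)])
          / (4 * pi ** 2))

    p2 = 0.5 - p4
    p1 = 1 - (p2 + p3 + p4 + p5)
    return {1: p1, 2: p2, 3: p3, 4: p4, 5: p5}


probs = level_probabilities_k5(-0.5, -0.5, -0.5)
```

For equal weights (all correlations equal to minus one-half) the sketch gives P(5,5) = 1/144, the approximate value quoted in § 4.3, while P(4,5), P(3,5) and P(2,5) take their exact equal-weight values 1/12, 7/24 and 5/12.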


4.3. A comparison between the exact and approximate probability integral can be made for the case $-\rho_{12} = -\rho_{23} = -\rho_{34} = \tfrac12$. In this case the exact value of P(5, 5) is 1/120 and the approximate value 1/144. This discrepancy has a negligible effect on the significance level as can be seen from Table 6. In fact even if P(5, 5) is put equal to zero the effect is not serious.


Table 6. Comparison of the probability integral given by the exact and approximate formulae for P(5, 5) when $-\rho_{12} = -\rho_{23} = -\rho_{34} = \tfrac12$

    y        Exact      Plackett's approximation     P(5, 5) = 0

  5.049     0.05000            0.04961                 0.04765
  8.356     0.01000            0.00989                 0.00933





Plackett also gives a table of exact values of P(5, 5) for various combinations of large $\rho$'s. In all cases P(5, 5) is very small so that any error involved in using the approximate formula will be unimportant. In some cases, especially when the approximate method yields a negative value, it would obviously be preferable to put P(5, 5) = 0. Tabulation of percentage points would require a table with three-way entry, although the size could be reduced by making use of symmetry and the exclusion of extreme values of the $\rho$'s.

The method used here would give a partial solution for k > 5. For example, when k = 6
we could assume that P(6, 6) was negligible, obtain an approximate value for P(5, 6) and 
determine the remaining probabilities exactly. This possibility has not been explored. 


THE MULTIPLE-RECAPTURE CENSUS
II. ESTIMATION WHEN THERE IS IMMIGRATION OR DEATH 


By J. N. DARROCH 
Statistical Laboratory, University of Manchester 


1. INTRODUCTION 


1.1. In a previous paper (1958), which will be referred to as (I), we discussed the multiple-recapture census when the population is closed both to augmentation from outside and departure from inside. These restrictions are now removed.

Let the experimenter take $s$ samples, as in (I). Also, let $1-\phi_k$ be the conditional probability that an individual dies (or permanently emigrates) between the $k$th and $(k+1)$th samples given that it is alive at the time of the $k$th, $k = 1, 2, \ldots, s-1$. Let $n_1$ be the size of the population at the time of the first sample and let $n_k - n_{k-1}$ new individuals immigrate (or be born) into the population between the $(k-1)$th and $k$th samples and be alive at the time of the $k$th, $k = 2, 3, \ldots, s$. Nothing is assumed about the $n_k - n_{k-1}$ and they are treated as parameters of the model. To treat them as random variables would entail assumptions about the manner in which they vary and would complicate rather than simplify the probability densities.

In § 2 we shall take $\phi_k = 1$ in which case $n_i$ is the population size at the time of the $i$th sample, $i = 1, 2, \ldots, s$. In § 3, $n_i = n$ and the size of the population at the time of the $i$th sample ($i > 1$) is a random variable with expected value $n\phi_1 \cdots \phi_{i-1}$. The general case when there is both immigration and death will be considered in § 4.


1.2. The main aims of this paper, as of (I), are to provide exact, fully stochastic models for the observed frequencies of individuals, to show how simply these frequencies naturally group themselves, and to obtain estimates of the unknown parameters. When there is immigration only or death only, the estimates are shown to be asymptotically efficient and their variances are found. In addition, a method of performing tests on the values of the parameters is given. When both immigration and death are operating, on the other hand, the complexity of the probability density prevents us from going further than obtaining the estimates and merely indicating how their variances can be found.

Both in (I) and in the present work we have been unable to obtain satisfactory tests of the underlying assumptions of the models, notably that tagged and untagged animals are captured with equal probabilities. (The above-mentioned tests on the values of the parameters assume the truth of the models.) We hope to fill this gap at a later date.


1.3. We gave a brief review of the literature and a list of references in (I) but, unfortunately, omitted to refer to a paper by Leslie, Chitty & Chitty (1953), the third of three papers by these authors. These three papers contain several ingenious mathematical approaches to multiple-recapture problems, together with a very full discussion of field data on populations of voles, and are uniquely valuable in the way that they dove-tail theory with practice.

One of the points emphasized by the authors is this: although a basic feature of the multiple-recapture census is that caught animals are returned to the population alive and unhurt, in practice a few are either accidentally killed or have to be removed from the population. This eventuality has not been taken account of in the following pages but we note here that it is easy to do so. Each of the usual observed classes 'caught in the $i, \ldots, l$ samples but not otherwise' is coupled with an extra one for those animals which are caught in the same samples but killed or removed at the $l$th. Also, an additional parameter is introduced to represent the probability that a captured animal is killed or removed at any sample. The estimates of the other parameters change slightly but, apart from this, there are no complications.


2. IMMIGRATION BUT NO DEATH 


2.1. In this section, suffices $i, j$ take all values from 1 to $s$ while suffices $k, l$ take all values from 2 to $s$.

Let $p_i$ ($= 1 - q_i$) denote the probability that any member of the population is caught at the $i$th sample and let $a_i$ be the size of the $i$th sample. Let $a_{<k}$ be the number of individuals caught before the $k$th sample and $a_{<k.k}$ the number which are caught before and at the $k$th sample. Further, let $w$ denote any non-empty subset of the integers $1, 2, \ldots, s$ and $u_w$ the number caught in every one of the $w$-samples but not otherwise. Let
$$r = \sum_w u_w = \sum_i u_i + \sum_{i<j} u_{ij} + \cdots + u_{12\ldots s},$$
the total number of individuals caught in the whole experiment.

The notation thus far is the same as in (I) and suffices for most of § 2. However, for the purpose of deriving the density $p[\{u_w\}]$, it is convenient to have a further notation. Namely, let $a_w$ be the number of individuals caught in every one of the $w$-samples, regardless of whether or not they are caught otherwise.

We begin by finding $p[\{u_w\}]$ for $s = 3$, using a chainwise argument. The probability of catching $a_1$ individuals in the first sample is
$$\binom{n_1}{a_1} p_1^{a_1} q_1^{n_1-a_1}.$$
Given $a_1$, the probability of catching $a_2$ in the second, of which $a_{12}$ are common to the first and $a_2 - a_{12}$ are new ones, is
$$\binom{a_1}{a_{12}} \binom{n_2-a_1}{a_2-a_{12}} p_2^{a_2} q_2^{n_2-a_2}.$$
Given $a_1, a_2, a_{12}$, the probability of catching $a_3$ in the third, of which $a_{123}$ are common to the first and second, $a_{13}-a_{123}$ to the first only, $a_{23}-a_{123}$ to the second only and of which $a_3 - a_{13} - a_{23} + a_{123}$ are new ones is
$$\binom{a_{12}}{a_{123}} \binom{a_1-a_{12}}{a_{13}-a_{123}} \binom{a_2-a_{12}}{a_{23}-a_{123}} \binom{n_3-a_1-a_2+a_{12}}{a_3-a_{13}-a_{23}+a_{123}} p_3^{a_3} q_3^{n_3-a_3}.$$
Multiplying these three expressions together, cancelling and renaming the terms involved, we obtain
$$p[\{u_w\}] = \frac{n_1!\,(n_2-a_{<2})!\,(n_3-a_{<3})!}{(n_1-a_{<2})!\,(n_2-a_{<3})!\,(n_3-r)!}\;\frac{1}{\prod_w u_w!}\;\prod_{i=1}^{3} p_i^{a_i} q_i^{n_i-a_i}. \tag{1}$$
Note that, if $n_i = n$, (1) contracts to
$$\frac{n!}{(n-r)!\,\prod_w u_w!}\;\prod_i p_i^{a_i} q_i^{n-a_i},$$
which is the model forming the basis of (I).
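
(1) may also be written
$$\begin{aligned} p[\{u_w\}] = {} & \frac{n_1!}{(n_1-a_{<2})!\,u_1!\,u_{12}!\,u_{13}!\,u_{123}!}\; q_1^{n_1-a_{<2}} (p_1q_2q_3)^{u_1} (p_1p_2q_3)^{u_{12}} (p_1q_2p_3)^{u_{13}} (p_1p_2p_3)^{u_{123}} \\ & \times \frac{(n_2-a_{<2})!}{(n_2-a_{<3})!\,u_2!\,u_{23}!}\; q_2^{n_2-a_{<3}} (p_2q_3)^{u_2} (p_2p_3)^{u_{23}} \\ & \times \frac{(n_3-a_{<3})!}{(n_3-r)!\,u_3!}\; q_3^{n_3-r}\, p_3^{u_3}. \end{aligned} \tag{1'}$$

Another way of exhibiting the distribution of the $u_w$ is by means of their generating function $\psi[\{t_w\}] = E[\prod_w t_w^{u_w}]$. It is soon shown that
$$\begin{aligned} \psi[\{t_w\}] = {} & (q_1q_2q_3 + p_1q_2q_3t_1 + p_1p_2q_3t_{12} + p_1q_2p_3t_{13} + p_1p_2p_3t_{123} + q_1p_2q_3t_2 + q_1p_2p_3t_{23} + q_1q_2p_3t_3)^{n_1} \\ & \times (q_2q_3 + p_2q_3t_2 + p_2p_3t_{23} + q_2p_3t_3)^{n_2-n_1} \\ & \times (q_3 + p_3t_3)^{n_3-n_2}. \end{aligned} \tag{2}$$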
2.2. Let us define $a_{<1} = 0$ and $a_{<s+1} = r$. Then for general $s$, (1) becomes
$$p[\{u_w\}] = \frac{1}{\prod_w u_w!}\;\prod_i \frac{(n_i-a_{<i})!}{(n_i-a_{<i+1})!}\; p_i^{a_i} q_i^{n_i-a_i}. \tag{3}$$
Maximizing with respect to $p_i$ by equating $(\partial/\partial p_i)\log p[\{u_w\}]$ to zero,
$$\hat n_i \hat p_i = a_i. \tag{4}$$
Maximizing with respect to $n_i$ by equating $\Delta_{n_i}\log p[\{u_w\}]$ to zero,
$$\frac{\hat n_i - a_{<i+1}}{\hat n_i - a_{<i}} = \hat q_i. \tag{5}$$
The equation for $i = 1$ in (4) is the same as that in (5) and there is consequently no information on $n_1$ and $p_1$ separately. Otherwise, combining (4) and (5) and remembering that $a_{<k+1} = a_{<k} + a_k - a_{<k.k}$, we obtain
$$\hat p_k = \frac{a_{<k.k}}{a_{<k}} \tag{6}$$
and
$$\hat n_k = \frac{a_{<k}\,a_k}{a_{<k.k}},$$
provided $a_{<k.k} > 0$.
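The estimates (6) can be computed directly from individual capture histories, as in the following minimal sketch. The function name `immigration_estimates` and the 0/1 tuple encoding of histories are assumptions made for illustration; they do not appear in the paper.

```python
def immigration_estimates(histories):
    """Return {k: (p_hat_k, n_hat_k)} for k = 2..s from 0/1 capture histories.

    Each history is a tuple with one 0/1 entry per sample, one tuple per
    individual caught at least once.  An entry is None when no individual
    caught before sample k is recaptured at sample k.
    """
    s = len(histories[0])
    estimates = {}
    for idx in range(1, s):  # idx is 0-based; the paper's sample number is idx + 1
        a_k = sum(h[idx] for h in histories)                      # size of kth sample
        a_before = sum(1 for h in histories if any(h[:idx]))      # caught before kth
        a_recap = sum(1 for h in histories if h[idx] and any(h[:idx]))  # before and at kth
        if a_recap == 0:
            estimates[idx + 1] = None
        else:
            estimates[idx + 1] = (a_recap / a_before, a_before * a_k / a_recap)
    return estimates


example = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1),
           (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 1, 0)]
print(immigration_estimates(example))
```

In this made-up example five animals are caught before the second sample, five are caught at it and three of those are recaptures, so the estimate of $p_2$ is 3/5 and that of $n_2$ is 25/3.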


2.3. Having found maximum likelihood estimates from the density (3) we cannot apply maximum likelihood large sample theory to them directly, for reasons given in (I) § 3.2. The principal obstacle is the presence of the $n_i$ in the factorial terms of (3). To get over this difficulty, we use a suitably modified version of the argument of (I) § 5.7. That is, we apply maximum likelihood large sample theory to a density which is conditional on some of the sample variables and afterwards take expectations over these variables. In this case, the most suitable conditional density is $p[\{u_w\} \mid \{a_{<i+1}\}]$.

The conditional distribution of $a_{<i+1} - a_{<i}$ given $a_{<i}$ is $B[n_i - a_{<i}, p_i]$ (including $i = 1$ when the distribution of $a_1$ is $B[n_1, p_1]$) and therefore
$$p[\{a_{<i+1}\}] = \prod_i \binom{n_i - a_{<i}}{a_{<i+1} - a_{<i}} p_i^{a_{<i+1}-a_{<i}} q_i^{n_i - a_{<i+1}}.$$
Hence
$$p[\{u_w\} \mid \{a_{<i+1}\}] = \frac{1}{\prod_w u_w!}\;\prod_i (a_{<i+1}-a_{<i})!\;\prod_k p_k^{a_{<k.k}} q_k^{a_{<k}-a_{<k.k}}. \tag{7}$$
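
Reverting for the moment to the parent density, we see that
$$\hat n_k = (a_{<k+1} - a_{<k})/\hat p_k + a_{<k}, \tag{8}$$
where $\hat p_k$ is given by (6). This equation will prove useful for two reasons. First, $a_{<k}$ and $a_{<k+1}$ are held constant in the conditional density (7), and secondly, $\hat p_k$ is the maximum likelihood estimate of $p_k$ for the conditional as well as the parent density. In view of (8), it turns out to be preferable to change from $p_k$ to $\theta_k = p_k^{-1}$. (8) then becomes
$$\hat n_k = (a_{<k+1} - a_{<k})\,\hat\theta_k + a_{<k}. \tag{9}$$
Let $L = L[\{\theta_k\}] = \log p[\{u_w\} \mid \{a_{<i+1}\}]$. Then
$$\frac{\partial L}{\partial\theta_k} = -\frac{a_{<k}}{\theta_k} + \frac{a_{<k} - a_{<k.k}}{\theta_k - 1},$$
which, when equated to zero, gives $\hat\theta_k = a_{<k}/a_{<k.k}$, provided $a_{<k.k} > 0$. Further
$$-E\left[\frac{\partial^2 L}{\partial\theta_k^2}\,\Big|\,\{a_{<i+1}\}\right] = \frac{a_{<k}}{\theta_k^2(\theta_k-1)},$$
$$-E\left[\frac{\partial^2 L}{\partial\theta_k\,\partial\theta_l}\,\Big|\,\{a_{<i+1}\}\right] = 0 \quad (k \ne l),$$
yielding the asymptotic formulae
$$E[(\hat\theta_k - \theta_k)^2 \mid \{a_{<i+1}\}] = \frac{\theta_k^2(\theta_k-1)}{a_{<k}},$$
$$E[(\hat\theta_k - \theta_k)(\hat\theta_l - \theta_l) \mid \{a_{<i+1}\}] = 0.$$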

When $a_{<k.k} = 0$, the Taylor expansion of $\partial L/\partial\theta_k$ at $\hat\theta_k$ about its value at $\theta_k$ is meaningless since there is no finite solution for $\hat\theta_k$. However, the resulting invalidity of the last formulae is easily avoided by changing from $\hat\theta_k$ to
$$\theta_k' = \frac{a_{<k} + 1}{a_{<k.k} + 1},$$
since $\theta_k'$ is always finite and has an asymptotic variance differing only negligibly from that of $\hat\theta_k$ over the finite part of the latter's range. (This can be shown using the $\delta$-technique.) Moreover, $\theta_k'$ has a negligible bias. For, given $\{a_{<i+1}\}$, $a_{<k.k}$ is $B[a_{<k}, p_k]$. Therefore
$$E[\theta_k' \mid \{a_{<i+1}\}] = E[\theta_k' \mid a_{<k}] = \theta_k(1 - q_k^{a_{<k}+1})$$
and the difference between this expression and $\theta_k$ may be neglected.
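The expectation just quoted can be confirmed exactly by summing over the binomial distribution. The sketch below assumes the modified estimate is (number caught before + 1)/(number of recaptures + 1), with the recapture count binomial; the function names are illustrative only.

```python
from math import comb


def expected_theta_prime(a_before, p):
    """Exact E[(a_before + 1) / (X + 1)] for X ~ Binomial(a_before, p)."""
    q = 1.0 - p
    return sum(
        comb(a_before, x) * p ** x * q ** (a_before - x) * (a_before + 1) / (x + 1)
        for x in range(a_before + 1)
    )


def closed_form(a_before, p):
    """theta * (1 - q^(a_before + 1)) with theta = 1 / p."""
    return (1.0 - (1.0 - p) ** (a_before + 1)) / p


print(expected_theta_prime(20, 0.3), closed_form(20, 0.3), 1 / 0.3)
```

The two functions agree to rounding error, and the shortfall from $1/p$ shrinks geometrically as the number of previously caught individuals grows.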
Two further properties of the $\theta_k'$ should be mentioned here. First, they are independently distributed given $\{a_{<i+1}\}$. This follows from the independence of the variables $a_{<k.k}$, a property which is proved without difficulty. Secondly, using the $\delta$-technique, we find that
$$E[(\theta_k' - \theta_k)^2 \mid \{a_{<i+1}\}] = E[(\theta_k' - \theta_k)^2 \mid a_{<k}] = \frac{\theta_k^2(\theta_k-1)}{a_{<k}} + O\!\left(\frac{1}{a_{<k}^2}\right), \quad a_{<k} > 0.$$
Returning now to the estimation of the $n_k$ and recalling (9), we replace $\hat n_k$ by
$$n_k' = (a_{<k+1} - a_{<k})\,\theta_k' + a_{<k},$$
that is, by
$$n_k' = \frac{(a_{<k}+1)(a_k+1)}{a_{<k.k}+1} - 1. \tag{10}$$
Now
$$n_k' - n_k = (a_{<k+1} - a_{<k})(\theta_k' - \theta_k) + \theta_k[(a_{<k+1} - a_{<k}) - (n_k - a_{<k})p_k]. \tag{11}$$
Therefore, taking expectations conditional on $\{a_{<i+1}\}$,
$$E[(n_k' - n_k) \mid \{a_{<i+1}\}] = -(a_{<k+1} - a_{<k})\,\theta_k q_k^{a_{<k}+1} + \theta_k[(a_{<k+1} - a_{<k}) - (n_k - a_{<k})p_k].$$
Now taking expectations over $a_{<2}, \ldots, a_{<k-1}, a_{<k+1}, \ldots, a_{<s+1}$,
$$E[(n_k' - n_k) \mid a_{<k}] = -(n_k - a_{<k})\,p_k\theta_k q_k^{a_{<k}+1} = -(n_k - a_{<k})\,q_k^{a_{<k}+1},$$
since $a_{<k+1} - a_{<k} = B[n_k - a_{<k}, p_k]$. We shall neglect this bias.

Squaring (11) and taking expectations,
$$\begin{aligned} E[(n_k' - n_k)^2 \mid a_{<k}] = {} & [(n_k - a_{<k})^2p_k^2 + (n_k - a_{<k})p_kq_k]\,E[(\theta_k' - \theta_k)^2 \mid a_{<k}] \\ & + 2\theta_k(n_k - a_{<k})p_kq_k\,E[(\theta_k' - \theta_k) \mid a_{<k}] \\ & + \theta_k^2(n_k - a_{<k})p_kq_k. \end{aligned} \tag{12}$$
In deriving an asymptotic formula for $E[(n_k' - n_k)^2]$, a convenient limit process to use is: $n_i \to \infty$ such that $n_i/n_j \to c_{ij}$, constant. To evaluate the expectation of the first term of the right-hand side of (12) over $a_{<k}$, consider separately the ranges $0 \le a_{<k} < hn_1$ and $hn_1 \le a_{<k} \le n_{k-1}$, where $h < 1 - q_1 \cdots q_{k-1}$ and $hn_1$ is integral. Now $a_{<k}$ is the sum of independent binomial variables $B[n_1, 1 - q_1 \cdots q_{k-1}], \ldots, B[n_{k-1} - n_{k-2}, p_{k-1}]$ and therefore
$$P[a_{<k} < hn_1] \le P[B[n_1, 1 - q_1 \cdots q_{k-1}] < hn_1] = O(n_1^{-1/2}c^{n_1}),$$
where
$$c = \left(\frac{1 - q_1 \cdots q_{k-1}}{h}\right)^{h}\left(\frac{q_1 \cdots q_{k-1}}{1-h}\right)^{1-h} < 1.$$
Also $\theta_k'$ can never exceed $a_{<k} + 1$. Therefore, for the range $0 \le a_{<k} < hn_1$, the contribution of the first term of (12) is at most $O(n_k^2\,n_1^2\,n_1^{-1/2}c^{n_1}) = O(n_1^{7/2}c^{n_1}) = o(1)$. For the other part of the range, we may use the asymptotic formula for the variance of $\theta_k'$. Since the second term on the right-hand side of (12) is $o(1)$, we may thus far say that
$$E[(n_k' - n_k)^2] = \mathop{E}_{a_{<k} \ge hn_1}\left[\{(n_k - a_{<k})^2p_k^2 + (n_k - a_{<k})p_kq_k\}\left\{\frac{\theta_k^2(\theta_k - 1)}{a_{<k}} + O\!\left(\frac{1}{a_{<k}^2}\right)\right\}\right] + \mathop{E}_{a_{<k} > 0}\left[\theta_k^2(n_k - a_{<k})p_kq_k\right] + o(1).$$
Let
$$\alpha_{<k} = E[a_{<k}] = n_1(1 - q_1 \cdots q_{k-1}) + \cdots + (n_{k-1} - n_{k-2})p_{k-1}.$$
Then, expanding $1/a_{<k}$ about $1/\alpha_{<k}$ and taking expectations over $a_{<k}$, it will be found that
$$E[(n_k' - n_k)^2] = \frac{(n_k - \alpha_{<k})\,n_k q_k}{\alpha_{<k}\,p_k} + O(1). \tag{13}$$
Note that, when all $n_i = n$, (13) becomes
$$E[(n_k' - n)^2] = \frac{n\,q_1 \cdots q_{k-1}\,q_k}{(1 - q_1 \cdots q_{k-1})\,p_k} + O(1),$$
which was derived previously in (I).


Thus, the $n_k'$ are almost unbiased estimates of the $n_k$, with variances given by (13) and, as is easily shown, with negligible covariances. Further, $n_k'$ is the asymptotically most efficient estimate of the class of estimates
$$n_k^* = (a_{<k+1} - a_{<k})\,\theta_k^* + a_{<k}$$
for which
$$E[(\theta_k^* - \theta_k)^2 \mid a_{<k}, a_{<k+1}] = \frac{A_k}{a_{<k}} + o\!\left(\frac{1}{a_{<k}}\right)$$
and for which $E[(\theta_k^* - \theta_k) \mid a_{<k}, a_{<k+1}]$ is negligible. This is a simple consequence of the fact that, since $\theta_k'$ is asymptotically efficient for $\theta_k$, $\theta_k^2(\theta_k - 1)$ is the minimum attainable value of $A_k$. In (I) we pointed out that $n_k'$ may be evaluated from 'similar' tagging and that, if $n_i = n$, the $n_k'$ contain between them all of the information on $n$.

2.4. Suppose that catchability is constant throughout the experiment and that $e_i$ units of effort are expended in catching the $i$th sample. Then
$$q_i = e^{-e_i/\gamma}, \tag{14}$$
where the unknown coefficient of $e_i$ is written as a reciprocal parameter for much the same reason that, in § 2.3, we changed from $p_k$ to $\theta_k = p_k^{-1}$. In previous effort models, it has been customary to make the expected catch size proportional to the effort. This is equivalent here to writing $p_i = e_i/\gamma$ which involves a relative error $O(e_i/\gamma)$, and the equivalent approximation for $q_i$ is $q_i = 1$ which is obviously too severe. Therefore, accepting $q_i = 1 - e_i/\gamma$ as a satisfactory approximation to (14), that is, neglecting terms which are relatively $O(e_i^2/\gamma^2)$, the approximation of the same order for $p_i$ is $p_i = (e_i/\gamma)(1 - e_i/2\gamma)$. When it is expedient to do so, we shall use these expressions. The fact that they lead to $p_i + q_i = 1 - e_i^2/2\gamma^2$ is not, of course, inconsistent with our decision to neglect terms $O(e_i^2/\gamma^2)$.
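Again, consider the conditional density $p[\{u_w\} \mid \{a_{<i+1}\}]$ which gives rise to the likelihood
$$e^{L(\gamma)} = \prod_k (1 - e^{-e_k/\gamma})^{a_{<k.k}}\,(e^{-e_k/\gamma})^{a_{<k} - a_{<k.k}}.$$
Hence
$$\frac{\partial L}{\partial\gamma} = \frac{1}{\gamma^2}\sum_k \left(a_{<k} - \frac{a_{<k.k}}{p_k}\right)e_k.$$
Equating this to zero, the equation for $\hat\gamma$ is
$$\sum_k \frac{a_{<k.k}\,e_k}{(e_k/\hat\gamma)(1 - e_k/2\hat\gamma)} = \sum_k a_{<k}\,e_k$$
and replacing $(1 - e_k/2\hat\gamma)^{-1}$ by $1 + e_k/2\hat\gamma$, we find that
$$\hat\gamma = \frac{\sum_k (a_{<k} - \tfrac12 a_{<k.k})\,e_k}{\sum_k a_{<k.k}}.$$
This formula breaks down if $\sum_k a_{<k.k} = 0$ (and so does the whole experiment since it has not yielded a single recapture). Therefore, instead of $\hat\gamma$, consider
$$\gamma' = \frac{\sum_k (a_{<k} - \tfrac12 a_{<k.k})\,e_k + \tfrac12\bar e}{\sum_k a_{<k.k} + 1},$$
where $\bar e = \sum_k e_k/(s-1)$. $\gamma'$ always exists and we suspect that it is almost unbiased, for the following reason. If all $e_k = \bar e$ and $p_k = p$ say,
$$\gamma' = \bar e\,\frac{\sum_k a_{<k} + 1}{\sum_k a_{<k.k} + 1} - \tfrac12\bar e$$
and $\sum_k a_{<k.k} = \sum_k B[a_{<k}, p] = B[\sum_k a_{<k}, p]$ since the $a_{<k.k}$ are independent. Hence
$$E[\gamma' \mid \{a_{<i+1}\}] = \frac{\bar e}{p}\left[1 - q^{\sum_k a_{<k} + 1}\right] - \tfrac12\bar e.$$
Neglecting $q^{\sum_k a_{<k}+1}$ and replacing $p$ by $(\bar e/\gamma)(1 - \bar e/2\gamma)$,
$$E[\gamma' \mid \{a_{<i+1}\}] \simeq \gamma.$$
In the general case when the $e_k$ are unequal, it still seems reasonable to assume that $\gamma'$ is almost unbiased.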
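A minimal sketch of the modified effort estimate follows. It assumes the form of the estimate to be (sum over k of (caught before k, minus half the recaptures at k) times effort, plus half the mean effort) divided by (total recaptures plus one), with the mean effort taken over samples 2 to s; the function name and list layout are illustrative.

```python
def gamma_prime(a_before, a_recap, effort):
    """Modified estimate of the effort parameter.

    All three lists are indexed by sample k = 2..s: the number caught before
    sample k, the number of those recaptured at sample k, and the effort e_k.
    """
    e_bar = sum(effort) / len(effort)  # mean effort over the s - 1 samples used
    numerator = sum((ab - 0.5 * ar) * e
                    for ab, ar, e in zip(a_before, a_recap, effort)) + 0.5 * e_bar
    return numerator / (sum(a_recap) + 1)


# Equal efforts: the estimate reduces to e_bar * (sum a_before + 1) / (sum a_recap + 1) - e_bar / 2.
print(gamma_prime([5, 7], [3, 3], [2.0, 2.0]))
```

Unlike the unmodified estimate, this quantity remains finite when the experiment yields no recaptures.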
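To obtain $E[(\gamma' - \gamma)^2 \mid \{a_{<i+1}\}]$ we may invert
$$-E\left[\frac{\partial^2 L}{\partial\gamma^2}\,\Big|\,\{a_{<i+1}\}\right] = \frac{1}{\gamma^4}\sum_k a_{<k}\,e_k^2\,\frac{q_k}{p_k} = \frac{1}{\gamma^2}\sum_k a_{<k}\,p_k$$
on making use of our approximations for $p_k$ and $q_k$. The $\delta$-technique shows that
$$E[(\gamma' - \gamma)^2 \mid \{a_{<i+1}\}] = \frac{\gamma^2}{\sum_k a_{<k}\,p_k} + O\!\left(\Big(\sum_k a_{<k}\Big)^{-2}\right).$$
From now on, the argument follows very much the same lines as in § 2.3. For the parent density
$$\hat n_i = (a_{<i+1} - a_{<i})/\hat p_i + a_{<i}$$
and $\hat\gamma$ has the same value as for the conditional density. We replace it by $\gamma'$ and $\hat n_i$ by $m_i'$ say, defined as
$$m_i' = (a_{<i+1} - a_{<i})\,\frac{\gamma'}{e_i}\left(1 + \frac{e_i}{2\gamma'}\right) + a_{<i}.$$
Hence
$$m_i' - n_i = \frac{a_{<i+1} - a_{<i}}{e_i}\,(\gamma' - \gamma) + \frac{1}{p_i}\,[(a_{<i+1} - a_{<i}) - (n_i - a_{<i})p_i]. \tag{15}$$
Taking expectations conditional on $\{a_{<i+1}\}$ and then over all $a_{<i+1}$ except $a_{<i}$, we see that $m_i'$ is an almost unbiased estimate of $n_i$. Further, squaring (15) and omitting the region: $0 \le a_{<k} < h_kn_{k-1}$, where $h_k < 1 - q_1 \cdots q_{k-1}$ and $h_kn_{k-1}$ is integral, all $k$, when taking expectations over $\{a_{<i+1}\}$, we find that
$$E[(m_i' - n_i)^2] = \frac{(n_i - \alpha_{<i})^2\,p_i^2\,\gamma^2}{e_i^2\sum_k \alpha_{<k}\,p_k} + \frac{(n_i - \alpha_{<i})\,q_i}{p_i} + O(1) \tag{16}$$
for the limit process: $n_i \to \infty$ such that $n_i/n_j$ is constant, all $i, j$. Besides being almost unbiased $m_i'$ is also most efficient of the class of estimates $m_i^*$ which satisfy obvious conditions.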

It is worth examining the gain in information that constant catchability and knowledge of the effort bring. The information on $n_1$ provided by $m_1'$ represents a total gain as there is none when the effort is not known. Otherwise,
$$F_k = \frac{\mathrm{var}(n_k')}{\mathrm{var}(m_k')} = \frac{1 + (\alpha_{<k+1} - \alpha_{<k})/\alpha_{<k}\,p_k}{1 + (\alpha_{<k+1} - \alpha_{<k})/\sum_l \alpha_{<l}\,p_l},$$
on replacing $(n_k - \alpha_{<k})p_k$ by $\alpha_{<k+1} - \alpha_{<k}$ and $e_k^2/\gamma^2$ by $p_k^2/q_k$. In order to evaluate $F_k$ numerically, some assumptions must be made about the $p_k$ and about the rate of immigration. First of all, suppose that $p_k = p$ for all $k$. Secondly, suppose that $\alpha_{<k+1} - \alpha_{<k}$ is constant for all $k$; that is, the expected number of new individuals caught at each sample is constant. Taken in combination with the first assumption, this implies that $n_k - n_{k-1} = n_1p$ for all $k$; that is, the number of immigrants between consecutive samples is constant and equal to the expected size of the first sample. The formula for $F_k$ can be shown to be very insensitive to the particular assumptions made about the rate of immigration, and we have merely chosen those which make the formula as simple as possible. It now becomes
$$F_k = \frac{1 + [(k-1)p]^{-1}}{1 + [s(s-1)p/2]^{-1}}$$
and is tabulated below for $s = 5$ and three values of $p$. For $p < 0.001$, the formula $F_k = s(s-1)/2(k-1)$ provides a very good approximation to the previous one.

    p        k = 2     3       4       5

  0.001      9.91    4.96    3.31    2.49
  0.01       9.18    4.64    3.12    2.36
  0.1        5.50    3.00    2.17    1.75





When we come to estimate the number of immigrants between samples, knowledge of the effort is doubly advantageous and results in a very considerable gain in information. For, $\mathrm{var}(m_k' - m_{k-1}') = \mathrm{var}(m_k') + \mathrm{var}(m_{k-1}') - 2\,\mathrm{cov}(m_k', m_{k-1}')$ and not only are the two variances smaller than their counterparts $\mathrm{var}(n_k')$ and $\mathrm{var}(n_{k-1}')$, but their covariance is large and positive whereas $\mathrm{cov}(n_k', n_{k-1}')$ is negligible. In fact,
$$\mathrm{cov}(m_i', m_j') = \frac{(\alpha_{<i+1} - \alpha_{<i})(\alpha_{<j+1} - \alpha_{<j})\,\gamma^2}{e_ie_j\sum_k \alpha_{<k}\,p_k} + O(1) \quad (i \ne j), \tag{17}$$
and, making the same simplifications as we did for $F_k$, we find that
$$G_k = \frac{\mathrm{var}(n_k' - n_{k-1}')}{\mathrm{var}(m_k' - m_{k-1}')} = 1 + \frac{1}{2p}\left(\frac{1}{k-2} + \frac{1}{k-1}\right) \quad (k = 3, \ldots, s).$$
$G_k$ is tabulated below for $s = 5$ and three values of $p$.

    p        k = 3     4       5

  0.001      751     418     293
  0.01       76.0    42.7    30.2
  0.1        8.50    5.17    3.92
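Both tabulations can be regenerated from the simplified formulae, as in the following sketch. It assumes the simplified forms $F_k = \{1 + [(k-1)p]^{-1}\}/\{1 + [s(s-1)p/2]^{-1}\}$ and $G_k = 1 + (1/2p)\{1/(k-2) + 1/(k-1)\}$ as read from the text; the function names `F` and `G` are illustrative.

```python
def F(k, s, p):
    """Variance ratio var(n'_k) / var(m'_k) under the simplifying assumptions, k = 2..s."""
    return (1 + 1 / ((k - 1) * p)) / (1 + 1 / (s * (s - 1) * p / 2))


def G(k, p):
    """Variance ratio for the estimated number of immigrants, k = 3..s."""
    return 1 + (1 / (2 * p)) * (1 / (k - 2) + 1 / (k - 1))


s = 5
for p in (0.001, 0.01, 0.1):
    f_row = [round(F(k, s, p), 2) for k in range(2, s + 1)]
    g_row = [round(G(k, p), 2) for k in range(3, s + 1)]
    print(p, f_row, g_row)
```

Under these assumptions the ratio $G_k$ does not depend on $s$, because the term involving $s$ cancels between the variances and the covariance of the effort-based estimates.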


2.5. We mentioned in (I) that the capture-recapture model used by Hammersley (1953), when dealing with a large accumulation of data on the ringing of Alpine Swifts, contains a flaw. It is appropriate here to show how this comes about.

Hammersley effectively made two postulates: that the overall likelihood could be taken simply as the product of the individual likelihoods for the captured birds; and that for any such bird, its likelihood need only record its behaviour onwards from the sample in which it was first captured. For $s = 3$, the likelihood arrived at in this way is
$$(p_1q_2q_3)^{u_1}(p_1p_2q_3)^{u_{12}}(p_1q_2p_3)^{u_{13}}(p_1p_2p_3)^{u_{123}}(p_2q_3)^{u_2}(p_2p_3)^{u_{23}}\,p_3^{u_3}, \tag{18}$$
which may also be written
$$p_1^{a_1}\,p_2^{a_2}q_2^{a_{<3}-a_2}\,p_3^{a_3}q_3^{r-a_3},$$
or, for general $s$,
$$\prod_i p_i^{a_i}q_i^{a_{<i+1}-a_i}.$$
The maximum likelihood estimate of $p_i$ is $a_i/a_{<i+1}$. Dividing this into $a_i$ to obtain an estimate of $n_i$, we get $a_{<i+1}$. In other words, the population size at any time is estimated by the number of individuals captured up to that time, which is plainly unsatisfactory. The reason why this model is at fault is indicated by comparing (18) with (1') and with (7) for $s = 3$. The latter can be written
$$p[\{u_w\} \mid \{a_{<i+1}\}] = \frac{a_{<2}!}{u_1!\,u_{12}!\,u_{13}!\,u_{123}!}\,(q_2q_3)^{u_1}(p_2q_3)^{u_{12}}(q_2p_3)^{u_{13}}(p_2p_3)^{u_{123}} \times \frac{(a_{<3}-a_{<2})!}{u_2!\,u_{23}!}\,q_3^{u_2}\,p_3^{u_{23}}. \tag{19}$$
A remark worth making here is that, even if (18) is replaced by a true density such as (19) and $n_i$ is estimated by
$$\hat n_i = a_i/\hat p_i,$$
the sampling variance of $\hat n_i$ is not only attributable to that of $\hat p_i$ but also to that of $a_i$ (and also to their covariance). This remark is of course implicit in §§ 2.3 and 2.4 where, for other reasons, we took
$$\hat n_i = (a_{<i+1} - a_{<i})/\hat p_i + a_{<i}.$$

3. DEATH BUT NO IMMIGRATION 


3.1. In this section, suffices $i, j$ take all values from 1 to $s$ and suffices $k, l$ all values from 1 to $s-1$. $p_i$ is now the conditional probability that an individual is captured at the $i$th sample, given that it is then alive. Let $a_{>k}$ be defined as the number of individuals caught after the $k$th sample, with a corresponding definition for $a_{k.>k}$.

Let $\chi_k$ be the conditional probability of an individual not being caught after the $k$th sample given that it is alive at the time of the $k$th sample. Then, for $s = 3$,
$$\chi_1 = 1 - \phi_1 + \phi_1q_2(1 - \phi_2) + \phi_1q_2\phi_2q_3 = 1 - \phi_1p_2 - \phi_1q_2\phi_2p_3,$$
$$\chi_2 = 1 - \phi_2 + \phi_2q_3 = 1 - \phi_2p_3.$$
Let $\pi_0$ denote the probability that an individual is never caught and $\pi_i$ the probability that it is caught for the last time at the $i$th sample. Then, for $s = 3$,
$$\pi_0 = q_1\chi_1, \quad \pi_1 = p_1\chi_1, \quad \pi_2 = \phi_1p_2\chi_2, \quad \pi_3 = \phi_1\phi_2p_3$$
and
$$\pi_0 + \pi_1 + \pi_2 + \pi_3 = 1,$$
which has an obvious probability interpretation.
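The probabilities of last capture can be generated for any number of samples by a backward recursion, as in this sketch. The recursion (probability of not being caught after sample k equals probability of dying, plus probability of surviving, escaping sample k + 1 and not being caught thereafter) is an assumption that generalizes the s = 3 expressions above; the function name is illustrative.

```python
def last_capture_probabilities(p, phi):
    """Return (chi, pi) for capture probabilities p_1..p_s and survival phi_1..phi_{s-1}.

    chi[k] is the probability of not being caught after sample k + 1 (0-based k),
    given alive at that sample; pi[0] is the probability of never being caught and
    pi[i] the probability of being caught for the last time at sample i.
    """
    s = len(p)
    chi = [0.0] * s
    chi[s - 1] = 1.0  # there are no samples after the last one
    for k in range(s - 2, -1, -1):
        chi[k] = 1 - phi[k] + phi[k] * (1 - p[k + 1]) * chi[k + 1]

    pi = [(1 - p[0]) * chi[0]]
    alive = 1.0  # probability of surviving to the current sample
    for i in range(s):
        pi.append(alive * p[i] * chi[i])
        if i < s - 1:
            alive *= phi[i]
    return chi, pi


chi, pi = last_capture_probabilities([0.3, 0.4, 0.5], [0.9, 0.8])
print(chi, pi, sum(pi))
```

For three samples the recursion reproduces the two expressions given for the first two conditional probabilities, and the last-capture probabilities sum to one.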
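
The probability density of the $u_w$ is
$$p[\{u_w\}] = \frac{n!}{(n-r)!\,\prod_w u_w!}\;\pi_0^{n-r}\prod_w P_w^{u_w},$$
where, for $s = 3$,
$$P_1 = \pi_1, \quad P_2 = q_1\pi_2, \quad P_3 = q_1q_2\pi_3,$$
$$P_{12} = p_1\pi_2, \quad P_{13} = p_1q_2\pi_3, \quad P_{23} = q_1p_2\pi_3, \quad P_{123} = p_1p_2\pi_3.$$
Hence,
$$p[\{u_w\}] = \frac{n!}{(n-r)!\,\prod_w u_w!}\;\pi_0^{n-r}\,\pi_1^{r-a_{>1}}\,\pi_2^{a_{>1}-a_{>2}}\,\pi_3^{a_{>2}}\;p_1^{a_{1.>1}}q_1^{a_{>1}-a_{1.>1}}\;p_2^{a_{2.>2}}q_2^{a_{>2}-a_{2.>2}}.$$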
3.2. Define $a_{>0} = r$ and $a_{>s} = 0$. Then, for general values of $s$,
$$p[\{u_w\}] = \frac{n!}{(n-r)!\,\prod_w u_w!}\;\pi_0^{n-r}\prod_i \pi_i^{a_{>i-1}-a_{>i}}\prod_k p_k^{a_{k.>k}}q_k^{a_{>k}-a_{k.>k}}. \tag{20}$$
Besides $n$, the original parameters were $\{\phi_k\}$ and $\{p_i\}$, a total of $2s-1$. We have now changed to $\pi_0$, $\{\pi_i\}$ and $\{p_k\}$ subject to the constraints
$$\pi_0 + \sum_i \pi_i = 1, \quad\text{and}\quad \pi_0p_1 - \pi_1q_1 = 0.$$
The effective number of parameters is therefore $2s-2$, a reduction of one. The reason for this is that, if $p[\{u_w\}]$ is written as a function of $\{\phi_k\}$ and $\{p_i\}$, $\phi_{s-1}$ and $p_s$ only appear in the combination $\phi_{s-1}p_s$ and are therefore really only one parameter. In other words, they are non-identifiable.

Maximizing (20) with Lagrangian multipliers $\lambda_1, \lambda_2$ for the two constraints, we find that
$$\hat\lambda_2 = 0, \quad \hat\lambda_1 = \hat n,$$
$$\hat n\hat\pi_i = a_{>i-1} - a_{>i}, \quad \hat p_k = \frac{a_{k.>k}}{a_{>k}}. \tag{21}$$
Let us define $\Phi_1 = 1$ and otherwise
$$\Phi_i = \phi_1\phi_2 \cdots \phi_{i-1},$$
the probability of survival up to the time of the $i$th sample. Then
$$\Phi_kp_k = \pi_k + p_k(\pi_{k+1} + \cdots + \pi_s),$$
with obvious probability interpretation. Therefore
$$n\Phi_k = n\pi_k/p_k + (n\pi_{k+1} + \cdots + n\pi_s).$$
Now let $\hat n_k$ denote the estimate of $n\Phi_k$, which is the expected population size at the time of the $k$th sample. Then substituting the above estimates of $n\pi_k$,
$$\hat n_k = (a_{>k-1} - a_{>k})/\hat p_k + a_{>k} \tag{22}$$
and hence, substituting for $\hat p_k$,
$$\hat n_k = \frac{a_k\,a_{>k}}{a_{k.>k}}.$$
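The estimates of § 3.2 mirror those of § 2.2 with 'before' replaced by 'after', and can be computed from capture histories as in this sketch. It assumes the estimates are (caught at k and after)/(caught after k) for the capture probability and (size of sample k) × (caught after k)/(caught at k and after) for the expected population size; the function name and the 0/1 tuple encoding are illustrative.

```python
def death_estimates(histories):
    """Return {k: (p_hat_k, n_hat_k)} for k = 1..s-1 from 0/1 capture histories.

    n_hat_k estimates the expected population size at the kth sample.  An entry
    is None when no individual caught at sample k is caught again later.
    """
    s = len(histories[0])
    estimates = {}
    for idx in range(s - 1):  # idx is 0-based; the paper's sample number is idx + 1
        a_k = sum(h[idx] for h in histories)                           # size of kth sample
        a_after = sum(1 for h in histories if any(h[idx + 1:]))        # caught after kth
        a_both = sum(1 for h in histories if h[idx] and any(h[idx + 1:]))  # at kth and after
        if a_both == 0:
            estimates[idx + 1] = None
        else:
            estimates[idx + 1] = (a_both / a_after, a_k * a_after / a_both)
    return estimates


example = [(1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1),
           (0, 1, 0), (0, 1, 1), (0, 0, 1), (1, 1, 0)]
print(death_estimates(example))
```

The ratio of successive population estimates then gives a rough estimate of the survival probability between the two samples concerned.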


3.3. At this stage, the argument becomes so nearly parallel to that of §§ 2.3 and 2.4 that we can omit almost all of it. The starting point is the comparison of (21) and (22) with (6) and (8). Next, $n - r$ and $\{a_{>i-1} - a_{>i}\}$ are distributed multinomially with parameters $n$, $\pi_0$ and $\{\pi_i\}$. Therefore,
$$p[\{a_{>i-1}\}] = \frac{n!}{(n-r)!\,\prod_i (a_{>i-1} - a_{>i})!}\;\pi_0^{n-r}\prod_i \pi_i^{a_{>i-1}-a_{>i}}.$$
Hence
$$p[\{u_w\} \mid \{a_{>i}\}] = \frac{1}{\prod_w u_w!}\;\prod_i (a_{>i-1} - a_{>i})!\;\prod_k p_k^{a_{k.>k}}q_k^{a_{>k}-a_{k.>k}},$$
which should be compared with (7).

Let
$$n_k'' = \frac{(a_k + 1)(a_{>k} + 1)}{a_{k.>k} + 1} - 1.$$
Then $n_k''$ is almost unbiased for $n\Phi_k$ as $n_k'$ was for $n_k$. There is, however, an interesting additional term in the expressions for $\mathrm{var}(n_k'')$ and $\mathrm{cov}(n_k'', n_l'')$ which was not present in the corresponding formulae for $n_k'$, $n_l'$. We find that
$$\mathrm{var}(n_k'') = \frac{(n\Phi_k - \alpha_{>k})\,n\Phi_k\,q_k}{\alpha_{>k}\,p_k} + n\Phi_k(1 - \Phi_k) + O(1)$$
as $n \to \infty$, all other parameters remaining constant. The second term is the variance of the actual population size at the time of the $k$th sample. In the same way,
$$\mathrm{cov}(n_k'', n_l'') = n\Phi_l(1 - \Phi_k) + O(1) \quad (k < l),$$
and the first term is the covariance of the actual population sizes at the times of the $k$th and $l$th samples.

In (I) we remarked that, if all $\phi_k = 1$, the $n_k''$ contain between them all of the information on $n$ but that their evaluation requires 'differentiated' tagging.

3.4. If $q_i = e^{-e_i/\gamma}$ and the $e_i$ are known, it is still true that
$$\hat n\hat\pi_i = a_{>i-1} - a_{>i}.$$
Hence
$$\hat n\hat\Phi_i = (a_{>i-1} - a_{>i})/\hat p_i + a_{>i}.$$
Let
$$\gamma'' = \frac{\sum_k (a_{>k} - \tfrac12 a_{k.>k})\,e_k + \tfrac12\bar e}{\sum_k a_{k.>k} + 1},$$
the almost unbiased estimate of $\gamma$ and let $m_i''$ be the corresponding estimate of $n\Phi_i$. The terms $n\Phi_i(1 - \Phi_i)$ and $n\Phi_j(1 - \Phi_i)$ appear in the formulae for $\mathrm{var}(m_i'')$, $\mathrm{cov}(m_i'', m_j'')$, $i < j$, in addition to those corresponding to (16) and (17). But, as far as $F_k'' = \mathrm{var}(n_k'')/\mathrm{var}(m_k'')$ is concerned, the additional terms may be omitted because, although they are of the same order, $O(n)$, as the others, they are negligible by comparison. Therefore,
$$F_k'' = \frac{1 + (\alpha_{>k-1} - \alpha_{>k})/\alpha_{>k}\,p_k}{1 + (\alpha_{>k-1} - \alpha_{>k})/\sum_l \alpha_{>l}\,p_l}.$$
The simplifying assumptions are now that $p_k = p$ and $\alpha_{>k-1} - \alpha_{>k}$ is constant for all $k$. Taken in conjunction, these two assumptions imply that $n\Phi_k(1 - \phi_k) = n\Phi_sp_s$, that is the expected number of deaths between consecutive samples is constant and equal to the expected size of the last sample. With these simplifications,
$$F_k'' = F_{s-k+1} \quad (k = 1, 2, \ldots, s-1).$$
Also,
$$G_k'' = \frac{\mathrm{var}(n_{k-1}'' - n_k'')}{\mathrm{var}(m_{k-1}'' - m_k'')} = G_{s-k+2} \quad (k = 2, \ldots, s-1).$$
The experimenter will probably be more interested in estimating the survival probability $\phi_{k-1}$ than $n\Phi_{k-1} - n\Phi_k$, the expected number of deaths. Both $n_k''/n_{k-1}''$ and $m_k''/m_{k-1}''$ are biased estimates of $\phi_{k-1}$ but the biases are $O(n^{-1})$ and can easily be evaluated, if desired, using the $\delta$-technique. The $\delta$-technique also shows that, neglecting terms of higher order in $n^{-1}$,
$$\frac{\mathrm{var}(n_k''/n_{k-1}'')}{\mathrm{var}(m_k''/m_{k-1}'')} = \frac{\mathrm{var}(n_k'' - \phi_{k-1}n_{k-1}'')}{\mathrm{var}(m_k'' - \phi_{k-1}m_{k-1}'')}$$
and, provided $\phi_{k-1}$ is not too much less than one, the latter ratio is well represented by $G_k''$.

3-5. Throughout (I) and so far in this paper, we have been exclusively occupied with 
estimation and have said nothing about testing hypotheses. In this subsection, the likeli- 
hood-ratio method will be adapted for the purpose of testing hypotheses about the values 


of {f,} and {p;}. 
Consider any density of the form 


PL{%w}] = Qr- , Il Pu, 


(n- wont T Uy! 
where Q and the P,, are functions of parameters {6,} say. Let the log-likelihood of n and {6,} 
be L, = log p[{u,,}]. Next, consider the conditional density 

UI Pas 

2a -Qy" 


PL{uof | 7] = 








and let the corresponding log-likelihood of {6,} be L, = log p[{u,,} | r]. 
Let £, and £, denote the maximum values of L, and L,. We show first that, to a good 
approximation, L, differs from L, only by an additive constant. Equating 0L,/00, to zero, 


n—r 0oQ Teg OF ey 
“@ 0,2 P26, ~° -” 
Equating A,, L, to zero, log n—log (n—r)+log Q = 0. (24) 
(24) gives (n—r)/Q = r/(1—Q) and, substituting in (23), we have 


r 0 Uy OP, 

1-0 20,1 =P, 0," 
But this is seen to be the equation 0L,/00, = 0. Therefore, the maximum likelihood estimates 
Qand P,, of Q and P, are the same for L, and L,; and for L,, % = r/(1—Q). To find £, and L,, 
an approximation for $n!$ is required, and we shall assume that $n! = K e^{-n} n^{n}$. (This approximation is equivalent to our practice of maximizing $L_1$ with respect to $n$ by equating $\Delta_n L_1$ to zero rather than $\partial L_1/\partial n$. For, if the exact identity $\Delta_n \log n! = \log n$ is replaced by the approximate one $d \log n!/dn = \log n$ and the latter is integrated, it leads directly to $n! = K e^{-n} n^{n}$. Using this approximation instead of Stirling’s, $n! = \sqrt{(2\pi)}\, e^{-n} n^{n+\frac12}$, results in an increase of ½ in the maximum likelihood estimate of $n$, a difference which can be ignored as it is $O(n^{-1})$.) Hence, 


L, = fiflog % — 1] —(#—1) flog (4-1) — 1]—log [] u,,! + (#—r) log Q+ Dd u,, log P,, 


= r(logr— 1]—log [J u,,! —rlog (1—-Q) + Su, log P,, 
w w 
= [,-log K, 


which proves our assertion. 

Let H, denote a hypothesis which reduces the number of degrees of freedom of {0} by 
v and H the alternative hypothesis allowing {0,} full freedom. Then, as L, has the properties 
required for maximum-likelihood large sample theory, we can say, using self-explanatory 
notation, that 
e's Len, - L,, 7) 


is approximately distributed as y?. More precisely (Bartlett, 1953), 


Pla < a, < b|r]= [ foaae +00) as 100. 
Let a= —2L, 5,—Ly,n). 
Then, since L, ,, = L,,,—K and, similarly, £, ,, = L. 4,—K, 
2, = Xz. 
Therefore, if P, is the probability of r individuals, 
Pla<2,<b]= > PPla<a,<blr]+ ¥ PPla<2x,<b|r, 


rshn r>hn 


where 0 < h < 1—Q such that hn is integral. Now 


¥ BPla <2, <r] < D B= O(n-tec) = o(n-), 


r<shn rshn 
y 1-Q h Q 1-h 
where c= (5°) (5) <i. 
b 
Aso, E BPla <a, <d|ri= ¥ lf F083 +0] 
r>hn r>hn a 


b 
= [1-o(n)] [ fo8axd+0en). 
Hence, Pla < x, < b) = "P08 dx? + O(n-) 


which is in the same form as the usual likelihood-ratio test. 

There are two applications of this test for which the algebra has already been performed. 
In each case H denotes the hypothesis allowing complete freedom to the parameters {¢;} 
and {p,} and it is therefore easy to find L, ;,. As the first example, consider Hy: ¢, = 1 all k. 
Lyn, can be found from (I) and v = (2s—2)—s = s—2. (If the test dictates that H, be 
accepted it will of course not mean that we believe it to be exactly true since this is 





J. N. Darroch 349 


strictly impossible for an animal population. It will mean, rather, that the ¢, cannot be 
regarded as reliable estimates of the ¢,.) As the second example, consider H,: g; = e~% 
where the e; are known. This will test whether or not catchability is constant. L, H, can 
be found from § 3-4 and, again, v = (2s—2)—s = s—2. 

Finally, a remark concerning hypothesis testing when there is immigration but no death. 
The above adaptation of the likelihood-ratio method is not applicable to the density (3). 
There is, however, a way round this difficulty. We can set up a formal model for immigration 
but no death which is the exact inverse of the model for death but no immigration. Namely, 
let the population be of size n at the time of the last sample and let 1—y,, be the conditional 
probability that any individual immigrated between the (k— 1)th and kth samples 
given that it was in the population at the time of the kth, k = 2,3, ...,s. This model yields 
just the same estimates as (3) and any likelihood-ratio test can be written down automatic- 
ally from the corresponding one for death but no immigration. (For this model p[{u,,}] is 
most quickly arrived at directly as a multinomial probability, but it can also be developed 
as a chain of conditional probabilities P[S,] P[S, | S,]... P(S,| Sj, ...,S,_] just as easily as 
for any other model. The main snag about this description of immigration is that it cannot 
be combined with stochastic death. To include both immigration and death, densities (3) 
and (20) must be ‘combined’, and we do this in the next and final section.) 


4. IMMIGRATION AND DEATH 
4-1. For s = 3, the generating function $E\left[\prod_w t_w^{u_w}\right]$ of $p[\{u_w\}]$ is clearly (cf. (2)) 


(1X1 t+ PiXrts + P11 PaXatie + P1P192P2Pstis + Pi Pi P2P2Pstios 
+991 P2Xotot+ 91 91P2P2Pstes + U1 9192 P2Psts)”™ 
X (F2X2+ P2oX2te t+ P2hoPstes + I2P2Psts)"*™ (ds + Pats)", (25) 
where y, and x, have been defined in § 3-1. The coefficient of [] tj in (25) turns out to be a 
w 


double sum of probabilities which only contracts to a single probability if n, = n, or d, = 1 
and if n. = n, or dy = 1. 

In general, p[{u,,}] is an (s — 1)-dimensional sum of probabilities which cannot be written 
as a single expression unless, in every interval between successive samples, there is either 
no immigration or no death. 


4-2. The unwieldy form of $p[\{u_w\}]$ clearly rules out estimation by maximizing the likelihood, but there is an obvious pointer to an alternative method. In all cases when there is no immigration or no death, the method of maximum likelihood is equivalent to equating some of the class sizes to their expected values. We recall from (I) that, for a closed population, these are $\{a_i\}$ and r. It is easily shown that, for immigration but no death, they are 
{a,}, deg, ...@<, and r, and, for death but no immigration, they are {a,;}, r and a,,, ...,d,_». 
Therefore, when there is immigration and death, it seems eminently reasonable to equate 


{a,}, Deg, +++,Meg, 1, A>y,+++,A>5-2 (26) 


to their expected values. We shall indicate how to solve the resulting equations by sup- 
posing that there are four samples, this being the minimum number from which the general 
solution may be inferred. 
For s = 4, the aforesaid equations can be arranged as 


ay = %4P}, (27a) 

Ag = [N91 +N_—N4] Po, (276) 

As = [Np 2+ (M2— M1) by +3 — Ne] Ps, (27¢) 

A, = [N91 $293 + (Me—Ny) Pohs + (M3 — Ne) Pg + Mq— Ns] Pa, (27d) 
T—A>1 = MP1XrD (28a) 
A514 — Ase = [MP1 +N.— M4] PoXo» (285) 
As2—As3 = [M912 + (M2— My) Pa +3 — Ns] P3X3, (28¢) 
Deg = MP}, (29a) 
Beg—Gg = [M919 + N2— 4] Po, (295) 
Aeg—Geg = [M1 91929192 + (Me— M1) Pode + Nz — Ne] Ps, (29¢) 


where (27a) reappears, for convenience, as (29a). 
Combining (27 a, b,c) with (28 a,b,c), we have x; = (@5,_1—45,)/a, orl — x, = Ay. 4/4, 
k = 1, 2,3. There is a recurrence relationship between successive x;, which may be written 


$x M4101 — Xv41)+ Pr Prira=1—-X_, (kK = 1,2). 











a a 
Hence $142 = +P. = —" ’ (30a) 
a a 
$243 = + $23 = - a (306) 


If we multiply (29a) by $,p., add it to (296) and refer to (27b), we obtain 


Aeg—Ghegt+ Py PoVes = a, 
or Py Poe = Veo.o (31a) 


Similarly, multiplying (29a) by ¢,¢op5, (296) by ¢,p, and adding them to (29c), 


$2P3( <3 — Gea) +91 PeP3%eg = Wez.y- (316) 


Equations (30) and (31) enable us to solve for po, p, and ¢,, ¢,. Hence, from (27 b,c), we 
can estimate n,¢,+n.—n, and n,$,¢.+(Ny—N1)bo+Ng—Nq, the expected population 
sizes at the times of the second and third samples. Also n,—., the number of immigrants 
between these two samples. We note that (27d) has not proved to be of any use. 

In general, it is possible to solve for py, ...,p,-, and hence for the expected population 
sizes at the times of all but the first and last samples. Also for the survival rate in all but the 
last interval and the number of immigrants in all but the first and last intervals. The re- 
maining parameters cannot be estimated and it is fairly obvious that this would hold true 
of any method of estimation. 

Since the distribution of the variables (26) is simply a combination of independent multi- 
nomial distributions, it is theoretically straightforward to find formulae for the variances 
and covariances of all the estimates by using the é-technique. However, it would be too 
tedious to compile these formulae for the general case and we shall not attempt it here. 
4-3. If $q_i = e^{-\gamma e_i}$ and the $e_i$ are known, we can again appeal to moment equations for 
the purpose of estimation. In (I) we remarked that, for a closed population, maximizing 
the likelihood is equivalent to equating ¥ a;e;/p; and r to their expected values. When there 

i 




is immigration but no death the corresponding variables are > a;¢;/p;, As (= 44), Meg, ---, eg 
i 
and r, and when there is death but no immigration they are }a;e;/p;, r and a.,,...,4.,-», 
i 


a,,-, (= 4,). With both immigration and death we therefore equate Xa; e;/p;, Gee, -.-, Aes, 
1, M4, -.-,4,,_, to their expected values, thus providing 2s equations. These enable us to 
estimate all of the 2s unknowns y, {n;} and {¢,}; except for s = 2 when the a, e,/p, + d2€9/p. 
equation is implied by the a, and a., equations and there are only three equations for the 
four unknowns. 

Without actually performing the straightforward but tedious task of finding the variances 
of the estimates, we can be confident that knowledge of the effort provides very considerable 
gains in information. 


I am indebted to the referee for his most helpful comments. 
ON AN EXTENSION OF THE CONNEXION BETWEEN 
POISSON AND χ² DISTRIBUTIONS 


By N. L. JOHNSON 
University College London 
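1. If a random variable $\chi^2$ has a (central) $\chi^2$ distribution with $\nu$ degrees of freedom then the probability density function of $\chi^2$ is

$$p(\chi^2) = \frac{(\chi^2)^{\frac12\nu-1}\, e^{-\frac12\chi^2}}{2^{\frac12\nu}\,\Gamma(\tfrac12\nu)} \qquad (\chi^2 > 0).$$

The probability that $\chi^2$ is less than a fixed value $X$ (> 0) is

$$\Pr\{\chi^2 \le X\} = \int_0^X p(\chi^2)\, d\chi^2 = \int_0^X \frac{(\chi^2)^{\frac12\nu-1}\, e^{-\frac12\chi^2}}{2^{\frac12\nu}\,\Gamma(\tfrac12\nu)}\, d\chi^2 .$$

If $\nu$ is even, then integrating by parts we finally obtain

$$\Pr\{\chi^2 \le X\} = 1 - e^{-\frac12 X}\left\{1 + \frac{\tfrac12 X}{1!} + \dots + \frac{(\tfrac12 X)^{\frac12\nu-1}}{(\tfrac12\nu-1)!}\right\}$$

$$= e^{-\frac12 X}\left\{\frac{(\tfrac12 X)^{\frac12\nu}}{(\tfrac12\nu)!} + \frac{(\tfrac12 X)^{\frac12\nu+1}}{(\tfrac12\nu+1)!} + \dots\right\}$$

$$= \Pr\{\xi \ge \tfrac12\nu\}, \qquad (1)$$

where $\xi$ is a Poisson variable with expected value $\tfrac12 X$.* This relationship between the Poisson and $\chi^2$ distributions has proved of use in several cases (e.g. table 7 of Pearson & Hartley (1954) and particularly Cox (1955)). 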
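Relation (1) can be checked numerically. The following is a minimal sketch, not part of the original paper: it integrates the central χ² density by Simpson's rule and compares the result with the Poisson upper tail. The function names and the number of integration steps are illustrative choices, and only even ν is handled.

```python
import math


def chi2_pdf(x, nu):
    """Density of a central chi-square variable with nu degrees of freedom."""
    if x < 0.0:
        return 0.0
    if x == 0.0:
        # Limit of the density at the origin for even nu >= 2.
        return 0.5 if nu == 2 else 0.0
    half = nu / 2.0
    log_density = ((half - 1.0) * math.log(x) - x / 2.0
                   - half * math.log(2.0) - math.lgamma(half))
    return math.exp(log_density)


def chi2_cdf_by_integration(X, nu, steps=2000):
    """Pr{chi^2 <= X} by composite Simpson integration (steps must be even)."""
    h = X / steps
    total = chi2_pdf(0.0, nu) + chi2_pdf(X, nu)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * chi2_pdf(i * h, nu)
    return total * h / 3.0


def poisson_upper_tail(k, mean):
    """Pr{xi >= k} for a Poisson variable xi with the given mean."""
    term = math.exp(-mean)  # Pr{xi = 0}
    below = 0.0
    for i in range(k):
        below += term
        term *= mean / (i + 1)
    return 1.0 - below


if __name__ == "__main__":
    for nu, X in [(2, 1.0), (4, 3.0), (10, 12.5)]:
        left = chi2_cdf_by_integration(X, nu)
        right = poisson_upper_tail(nu // 2, X / 2.0)
        print(nu, X, round(left, 8), round(right, 8))
```

The two printed columns should agree to the accuracy of the numerical integration, which is what (1) asserts for even ν.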

In this paper an interesting extension, connecting the non-central χ² distribution with the distribution of the difference between two independent Poisson variables, is discussed. A similar relationship is found between non-central F and the difference between a negative binomial variable and an independent Poisson variable. 


2. Let

$$\chi'^2 = \sum_{i=1}^{\nu} (u_i + a_i)^2, \qquad (2)$$

where $u_1, u_2, \dots, u_\nu$ are $\nu$ independent unit Normal variables and $a_1, a_2, \dots, a_\nu$ are constants with $\sum_{i=1}^{\nu} a_i^2 = \lambda$. Then the probability density function of $\chi'^2$ is

$$p(\chi'^2) = \sum_{j=0}^{\infty} \frac{e^{-\frac12\lambda}(\tfrac12\lambda)^j}{j!}\, \frac{(\chi'^2)^{\frac12\nu+j-1}\, e^{-\frac12\chi'^2}}{2^{\frac12\nu+j}\,\Gamma(\tfrac12\nu+j)} \qquad (\chi'^2 > 0).$$

This distribution of $\chi'^2$ is called the non-central χ² distribution with $\nu$ degrees of freedom, and non-centrality parameter $\lambda$.

* $\xi$ is used here (and $\xi'$ later) to represent a random variable. This is contrary to usual practice, but the connexion between $\xi$ and its mean $\tfrac12 X$ is similar to that between $l$ (introduced later) and its mean $\tfrac12\lambda$.

If $\nu$ is even, then, integrating with respect to $\chi'^2$, and using repeatedly the procedure by which (1) was obtained, we find

$$\Pr\{\chi'^2 \le X\} = \sum_{j=0}^{\infty} \frac{e^{-\frac12\lambda}(\tfrac12\lambda)^j}{j!} \sum_{i=\frac12\nu+j}^{\infty} \frac{e^{-\frac12 X}(\tfrac12 X)^i}{i!}. \qquad (3)$$

This can be written

$$\Pr\{\chi'^2 \le X\} = \sum_{j=0}^{\infty} \Pr\{l = j\}\,\Pr\{\xi \ge \tfrac12\nu + j\},$$

where $\xi$ is, as in §1, a Poisson variable with expected value $\tfrac12 X$, and $l$ is a Poisson variable with expected value $\tfrac12\lambda$. If we further suppose that $\xi$ and $l$ are independent, (3) is equivalent to

$$\Pr\{\chi'^2 \le X\} = \Pr\{\xi - l \ge \tfrac12\nu\}. \qquad (4)$$

The relation (4) is new, so far as the author’s knowledge goes, though Patnaik (1949) points out that $\Pr\{\chi'^2 \le X\}$ can be expressed as ‘a double Poisson sum’. 
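As a numerical illustration of (4), the sketch below (an addition, with illustrative function names and truncation points) evaluates the left side as a Poisson mixture of central χ² integrals obtained by Simpson integration, and the right side as the upper tail of the difference of two independent Poisson variables. It assumes ν is even.

```python
import math


def poisson_pmf(k, mean):
    """Pr{Poisson(mean) = k}; a zero mean puts all mass at k = 0."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def poisson_upper_tail(k, mean):
    """Pr{Poisson(mean) >= k}."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


def central_chi2_cdf(X, df, steps=1000):
    """Simpson integration of the central chi-square density over (0, X)."""
    def pdf(x):
        if x <= 0.0:
            return 0.5 if (x == 0.0 and df == 2) else 0.0
        half = df / 2.0
        return math.exp((half - 1.0) * math.log(x) - x / 2.0
                        - half * math.log(2.0) - math.lgamma(half))

    h = X / steps
    total = pdf(0.0) + pdf(X)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * pdf(i * h)
    return total * h / 3.0


def noncentral_chi2_cdf(X, nu, lam, terms=60):
    """Left side of (4): mixture of central integrals with nu + 2j d.f."""
    return sum(poisson_pmf(j, lam / 2.0) * central_chi2_cdf(X, nu + 2 * j)
               for j in range(terms))


def poisson_difference_tail(X, nu, lam, terms=60):
    """Right side of (4): Pr{xi - l >= nu/2}, xi and l independent, nu even."""
    half_nu = nu // 2
    return sum(poisson_pmf(j, lam / 2.0)
               * poisson_upper_tail(half_nu + j, X / 2.0)
               for j in range(terms))


if __name__ == "__main__":
    print(noncentral_chi2_cdf(6.0, 4, 3.0), poisson_difference_tail(6.0, 4, 3.0))
```

The truncation at 60 terms is an assumption that is adequate only for moderate λ; larger non-centrality would need more terms.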

3. If either side of (4) could be approximated it would, of course, provide an approximation to the other side.

It can be shown in straightforward fashion that if $k \ge 0$

$$\Pr\{\xi - l = k\} = (X/\lambda)^{\frac12 k}\, e^{-\frac12(X+\lambda)} \sum_{j=0}^{\infty} \frac{(\tfrac12\sqrt{X\lambda})^{k+2j}}{j!\,(k+j)!} = (X/\lambda)^{\frac12 k}\, e^{-\frac12(X+\lambda)}\, I_k(\sqrt{X\lambda}), \qquad (5)$$

where $I_k(t)$ is the Bessel function, with imaginary argument, of order $k$ (cf. Irwin, 1937; Skellam, 1946). (For $k < 0$ the only ‘alteration’ in the formula is the substitution of $-k$ for $k$ as the order of the Bessel function. However, in evaluating (4), negative values of $k$ cannot arise.)

With adequate tables the sums of terms such as those on the right-hand side of (5) could clearly be evaluated. However, Patnaik (1949) has shown that a simple (central) χ² approximation to the non-central χ² distribution gives quite good results.

It appears so far, therefore, that it is doubtful whether (4) helps us approximate non-central χ² integrals: it seems more likely that the converse process of approximating distributions of differences of independent Poisson variables by non-central χ² approximations will be useful. However, if neither $\tfrac12 X$ nor $\tfrac12\lambda$ is too small, a Normal approximation (expected value $\tfrac12(X - \lambda)$, standard deviation $\sqrt{[\tfrac12(X + \lambda)]}$) to the distribution of $\xi - l$ should be justifiable. Using the natural continuity correction this leads to

$$\Pr\{\chi'^2 \le X\} \approx \frac{1}{\sqrt{(2\pi)}} \int_{-\infty}^{(X-\lambda-\nu+1)/\sqrt{[2(X+\lambda)]}} e^{-\frac12 x^2}\, dx. \qquad (6)$$

Formula (6) has been obtained by assuming $\nu$ is even, but considerations of continuity indicate that it will be also a good approximation when $\nu$ is odd. 
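Approximation (6) is simple to evaluate. The sketch below is an illustrative addition that uses the error function from the Python standard library for the Normal integral; the four argument triples are the (X, ν, λ) combinations listed in Table 2b, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx6(X, nu, lam):
    """Approximation (6) to Pr{chi'^2 <= X}: Normal approximation to xi - l
    with the continuity correction."""
    z = (X - lam - nu + 1.0) / math.sqrt(2.0 * (X + lam))
    return normal_cdf(z)


if __name__ == "__main__":
    for X, nu, lam in [(30, 16, 32), (60, 16, 32), (36, 24, 24), (72, 24, 24)]:
        print(X, nu, lam, round(approx6(X, nu, lam), 4))
```

The printed values can be compared with column (iii) of Table 2b.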

It is to be expected that the accuracy of (6) will increase with λ (and also with X). Patnaik’s first non-central χ² approximation, on the other hand, is more accurate for small values of λ, i.e. when departure from central χ² form is not great. Hence one may hope the two approximations will be complementary, Patnaik’s approximation being used to approximate the distribution of the difference of two independent Poisson variables when λ is small, and (6) used conversely to approximate to the non-central χ² distribution when λ is larger. The relation (4) is thus used to provide approximations for each of the two distributions concerned. 
4. The numerical comparisons in this section, while not extensive, should be sufficient to indicate the usefulness of approximations based on (6). In so far as extensive tables of $I_k(t)$ are or become available, especially for greater values of $k$, exact probabilities, for either of the two distributions, could be evaluated with some facility. Such tables might have other applications, e.g. in considering differences between experimental and control (background) counts. Such tables may indeed already be in existence and used for such purposes. 

Approximate upper and lower 100α % points for the non-central χ² distribution may be obtained by solving

$$\frac{X - \lambda - \nu + 1}{\sqrt{[2(X + \lambda)]}} = -u_\alpha \qquad (7)$$

for $X$, where

$$\frac{1}{\sqrt{(2\pi)}} \int_{u_\alpha}^{\infty} e^{-\frac12 x^2}\, dx = \alpha.$$

Table 1 shows (i) some exact values of upper and lower 5 % points taken from Patnaik’s table 2, (ii) the errors in Patnaik’s χ² approximation, (iii) the errors involved in determining X from (7) above and the results ((iv) and (v)) of two other approximations referred to below. In our Table 2a direct probabilities calculated from (6) are compared with exact values from Patnaik’s table 6. 
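Setting the upper limit of the integral in (6) equal to a Normal deviate and squaring gives a quadratic in X, so the approximate percentage points have a closed form. The sketch below is an illustrative addition: it takes the deviate as −u for the lower point and +u for the upper point, where u cuts off an upper tail area α of the unit Normal distribution (an assumed sign convention), and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_percentage_points(nu, lam, alpha=0.05):
    """Roots of (X - lam - nu + 1)^2 = 2 u^2 (X + lam).

    Returns (lower, upper): the lower root corresponds to a deviate of -u,
    the upper root to a deviate of +u.
    """
    u = NormalDist().inv_cdf(1.0 - alpha)
    c = lam + nu - 1.0
    spread = u * math.sqrt(u * u + 2.0 * c + 2.0 * lam)
    return c + u * u - spread, c + u * u + spread


if __name__ == "__main__":
    for nu, lam in [(2, 1.0), (4, 1.0), (4, 16.0), (7, 25.0)]:
        lower, upper = approx_percentage_points(nu, lam)
        print(nu, lam, round(lower, 2), round(upper, 2))
```

For small ν and λ the lower root can be negative, which is the situation flagged by the footnote to Table 1.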


Table 1. Upper and lower 5 % points of the non-central χ² distribution 


((i) Exact values. Errors (approx. − true) in: (ii) Patnaik’s χ² approximation; (iii) approximation (7); 
(iv) approximation (7)’; (v) Pearson and Merrington’s approximation.) 





| Upper 5% Lower 5% 


(ii) | 





| 
(vy) | @ | 
| 








—0-04 | 0-17 | 0-02 * 


| $64 —0-01 | 0-92 | 0-57 . 
-0-06 | 065 | 029 | -0-43 | -0-28 | -0-12 


14-64 0-08 | 0-55 | 0-32 


(16 | 0-16 | —0-03 | 632 | 0-57 | —-0-25 | -O14 | —0-02 
| 25 | 45:31 | 0-35 | 0-23 | 0-13 | —0-03 | 12-08 | 0-60 | —0-21 | —0-12 | —0-01 | 
| | | 
4/ 1] 11-71 0-01 | 0-88 | 0-50 | —0-02 | 0-91 | 0-02 | —0-07 | —0-06 | —0-:03 | 
4 | 17-31 0-07 | 0:57 | 0:33 | —0-04 | 1-77 0-18 | —0-24 | —0-17 | —0-06 
16 | 35-43 0-26 | 0:30 | 0-17 | —0-03 | 7:88 | 0-48 | —0-20 | —0-13 | —0-02 
| 25 | 47-61 | 0-33 | 0-25 | O14 | —0-02 | 13-73 | 053 | -—0-17 | -O11 | —0-01 
| | | } 
7 | 1 | 1600) 0-01 | 0-83 | O47 | -0-01 | 2-49 | 0-02 | 010 | 0:02 | 0-00 
4 | 0-34 





bo 
— 
to 
3c) 
2 
So 
aK 
> 
on 
© 


| -002 | 366 | 0-12 | —0-07 | —0-06 | —0-02 
16 | 38-97 0-19 0-33 0-18 | —0-02 | 1026 | 038 | -015 | -0-09 | -0-01 





51-06 | 0-28 0-26 | 0-14 | —0-02 | 16-23 | 0-45 | —0-14 — 0-09 —0-01 | 
| | | | 


* Formulae (7), (7)’ give negative values in this case. 














The relative accuracy of approximation (6) and (7), and Patnaik’s χ² approximation (in Table 1a), varies with ν, λ and the limit of integration, X, roughly, but not quite, as anticipated in §3. For small λ, Patnaik’s is much the better of the two approximations. As λ increases the position becomes reversed. But the cross-over point (of about equal accuracy) is at a considerably lower value of λ for the lower 5 % point than for the upper 5 % point. This is rather unexpected, though the proportionate accuracy of (7) is greater at the upper 5 % point as would be expected. (The results for ν = 7, lower 5 % points, are rather anomalous, but the differences vary systematically, giving no indication of incorrect values.) 


Table 2a. Probability that non-central χ² exceeds upper 5 % point 
of corresponding central χ² distribution 


((i) Exact values (from Patnaik); (ii) approximation (6).) 

          λ = 2           λ = 4           λ = 8           λ = 12          λ = 20
  ν     (i)    (ii)     (i)    (ii)     (i)    (ii)     (i)    (ii)     (i)    (ii)
  2    0.234  0.226    0.416  0.412    0.719  0.715    0.885  0.878    0.983  0.981
  3    0.195  0.193    0.357  0.352    0.665  0.651    0.841  0.837    0.974  0.971
  4    0.171  0.175    0.320  0.316    0.605  0.601    0.803  0.799    0.963  0.961
  6    0.146  0.152    0.276  0.266    0.531  0.526    0.738  0.735    0.940  0.938
  8    0.131  0.136    0.238  0.235    0.477  0.472    0.685  0.681    0.916  0.914
 12    0.113  0.115    0.198  0.197    0.402  0.395    0.601  0.596    0.866  0.866
 20    0.096  0.101    0.158  0.159    0.315  0.310    0.489  0.483    0.775  0.775







Table 2b. Comparison of approximations to the non-central χ² probability integral 


((i) Exact values (from Patnaik); (ii) Patnaik’s Normal approximation; (iii) approximation (6); 
(iv) approximation (6) modified by (6)’; (v) Pearson and Merrington’s approximation.) 

  X     ν     λ      (i)      (ii)     (iii)    (iv)     (v)
 30    16    32    0.0609   0.0638   0.0634   0.0632   0.0628
 60    16    32    0.8316   0.8320   0.8310   0.8308   0.8320
 36    24    24    0.1567   0.1515   0.1577   0.1577   0.1563
 72    24    24    0.9667   0.9686   0.9644   0.9665   0.9667





5. Patnaik also gives a Normal approximation, which can be obtained by applying the relationship

$$\sqrt{(2\chi^2)} - \sqrt{(2\nu - 1)} \approx \text{unit Normal variable}$$

to his χ² approximation. This leads to

$$\Pr\{\chi'^2 \le X\} \approx \frac{1}{\sqrt{(2\pi)}} \int_{-\infty}^{t} e^{-\frac12 x^2}\, dx$$

with

$$t = \sqrt{\left[\frac{2X(\nu+\lambda)}{\nu+2\lambda}\right]} - \sqrt{\left[\frac{2(\nu+\lambda)^2}{\nu+2\lambda} - 1\right]}. \qquad (8)$$

A few numerical comparisons of (6) and (8), using values in Patnaik’s table 3, are shown in Table 2b. The two approximations appear to be of about equal accuracy, so far as is indicated by these four comparisons. 
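A short sketch of (8), added for illustration: the scale factor (ν+2λ)/(ν+λ) and the equivalent degrees of freedom (ν+λ)²/(ν+2λ) are those implied by the two square-root terms of (8), and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def patnaik_normal(X, nu, lam):
    """Approximation (8) to Pr{chi'^2 <= X}."""
    scale = (nu + 2.0 * lam) / (nu + lam)        # multiplier of the central chi-square
    df = (nu + lam) ** 2 / (nu + 2.0 * lam)      # equivalent degrees of freedom
    t = math.sqrt(2.0 * X / scale) - math.sqrt(2.0 * df - 1.0)
    return normal_cdf(t)


if __name__ == "__main__":
    for X, nu, lam in [(30, 16, 32), (60, 16, 32), (36, 24, 24), (72, 24, 24)]:
        print(X, nu, lam, round(patnaik_normal(X, nu, lam), 4))
```

The output can be set beside column (ii) of Table 2b.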
The limit of the integral in (6) is of the form

$$\frac{X - A}{\sqrt{(BX + C)}}, \qquad (9)$$

while in (8) the limit is of form

$$D\sqrt{X} - E, \qquad (10)$$

where A, B, C, D, E are constants depending on ν and λ. (9) is equal to zero when

$$X = A = \lambda + \nu - 1,$$

(10) equals zero when

$$X = (E/D)^2 = \lambda + \nu - \tfrac12(\nu + 2\lambda)/(\nu + \lambda).$$

The latter value lies between $\lambda + \nu - \tfrac12$ and $\lambda + \nu - 1$; hence the differences between the median non-central χ² given by the two approximations must be less than ½, and approximation (7) will always give a smaller median value than Patnaik’s Normal approximation. The forms of expressions (9) and (10) indicate that good agreement between the two approximations cannot be expected over the whole range of values of X. When X is large, for example, the ratio of (10) to (9) is approximately

$$D\sqrt{B} = \sqrt{\left[\frac{2(\nu+\lambda)}{\nu+2\lambda} \times 2\right]} = 2\sqrt{\left[\frac{\nu+\lambda}{\nu+2\lambda}\right]} > \sqrt{2}.$$

Hence for upper percentage points we would expect (6) (or equivalently (7)) to give values of X considerably larger than those given by Patnaik’s Normal approximation. Conversely, the latter should give larger values for $\Pr\{\chi'^2 \le X\}$. Table 2b indicates that this is the case, although the differences in values of $\Pr\{\chi'^2 \le X\}$ are not remarkably large. This may be because the values of X involved are not particularly large. 
6. Equation (4) is exact. The accuracy of approximation (6) is essentially the accuracy of a Normal approximation to the distribution of the difference between two independent Poisson variables. This accuracy should be rather greater than that of a Normal approximation to either Poisson distribution separately, especially if X = λ. The $\sqrt{\beta_1}$ skewness coefficient of the distribution of $\xi - l$ is

$$\frac{\sqrt{2}\,(X - \lambda)}{(X + \lambda)^{3/2}}.$$

A natural improvement to the Normal approximation may be obtained by adding the next term in the Gram–Charlier expansion, viz.

$$\frac{1}{3\sqrt{2}}\, \frac{X - \lambda}{(X + \lambda)^{3/2}} \left[\frac{(X - \lambda - \nu + 1)^2}{2(X + \lambda)} - 1\right] \frac{1}{\sqrt{(2\pi)}} \exp\left[-\frac{(X - \lambda - \nu + 1)^2}{4(X + \lambda)}\right] \qquad (6)'$$

to the right-hand side of (6). The results of adding this correction to the values in column (iii) of Table 2b are shown in column (iv) of this table. 
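The skewness correction can be added to (6) in a few lines. This sketch is an illustrative addition; it writes the correction as one-sixth of the skewness coefficient of ξ − l, times (t² − 1), times the unit Normal density at t, where t is the upper limit of the integral in (6). Function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx6_with_skewness(X, nu, lam):
    """Approximation (6) plus the Gram-Charlier skewness term."""
    s = X + lam
    t = (X - lam - nu + 1.0) / math.sqrt(2.0 * s)
    sixth_of_skewness = (X - lam) / (3.0 * math.sqrt(2.0) * s ** 1.5)
    density = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return normal_cdf(t) + sixth_of_skewness * (t * t - 1.0) * density


if __name__ == "__main__":
    for X, nu, lam in [(30, 16, 32), (60, 16, 32), (36, 24, 24), (72, 24, 24)]:
        print(X, nu, lam, round(approx6_with_skewness(X, nu, lam), 4))
```

The values printed correspond to column (iv) of Table 2b.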

Another method of seeking an improvement on (6) is to approximate to each Poisson variable by a multiple of a χ² variable such that the two variables have the same mean and standard deviation. $\xi$, $l$ are then replaced by $\tfrac12\chi^2_X$, $\tfrac12\chi^2_\lambda$, where $\chi^2_X$, $\chi^2_\lambda$ denote central χ² variables with X, λ degrees of freedom respectively.

From (4), we then obtain, using the continuity correction,

$$\Pr\{\chi'^2 \le X\} \approx \Pr\{\chi^2_X - \chi^2_\lambda > \nu - 1\}. \qquad (11)$$
In general this result would not be useful as a basis for computation, since the distribution of $\chi^2_X - \chi^2_\lambda$ is not simple to handle. If, however, one (preferably both) of the quantities X and λ are even integers, then the probability density function of $\chi^2_X - \chi^2_\lambda$ can be obtained as a simple finite sum. If min (X, λ) = 2 there is only one term in the expansion, if min (X, λ) = 4 there are two terms, and so on.

Just as approximation (6) was modified by (6)’ to allow for the skewness of distribution of $\xi - l$, we can also modify equation (7) for the percentage points of the non-central χ² distribution. Using the initial terms in the inverse Cornish–Fisher (1937) expansion we obtain

$$X - \lambda = \frac{(\nu - 1)(X + \lambda) - u_\alpha \sqrt{2}\,(X + \lambda)^{3/2}}{X + \lambda + \tfrac13(u_\alpha^2 - 1)}. \qquad (7)'$$

This is a quartic equation in X which could be solved by an iterative method. Computation is rather simpler, however, if (7)’ is used to obtain the value of X − λ and hence the values of X and λ, corresponding to a given value of X + λ. From the table of values of X and λ so obtained the value of X, corresponding to a given λ, can be found by interpolation. Column (iv) of Table 1 gives the errors in values of X obtained by this method.

Columns (v) of Tables 1 and 2b give errors in values of X, and computed probabilities, respectively, obtained from a new approximation due to E. S. Pearson and Maxine Merrington, which is described in a Note at the end of this paper. 


7. We now consider the non-central F distribution. This is defined as the distribution of the non-central F-variable

$$F' = \frac{(\text{non-central } \chi^2 \text{ with } \nu \text{ degrees of freedom})/\nu}{(\text{central } \chi^2 \text{ with } \nu_0 \text{ degrees of freedom})/\nu_0},$$

the numerator and denominator being independent. The denominator will be written, for brevity, as $v/\nu_0$. Using (4), the conditional probability that $F'$ does not exceed $X/\nu$, given the value of $v$, is

$$\Pr\{F' \le X/\nu \mid v\} = \Pr\{\chi'^2 \le Xv/\nu_0\} = \Pr\{\xi_v - l \ge \tfrac12\nu\} \qquad (12)$$

(if ν is even), where $\xi_v$ is a Poisson variable, independent of $l$, with expected value $\tfrac12 Xv/\nu_0$. $\chi'^2$ and $l$ have the same meaning as in earlier sections. Averaging (12) over the distribution of $v$, we find the unconditional probability

$$\Pr\{F' \le X/\nu\} = \Pr\{\xi' - l \ge \tfrac12\nu\}, \qquad (13)$$

where $\xi'$ is a random variable, independent of $l$, with distribution obtained by averaging the Poisson distribution of $\xi_v$ over the distribution of $v$. Thus

$$\Pr\{\xi' = j\} = \frac{1}{2^{\frac12\nu_0}\,\Gamma(\tfrac12\nu_0)} \int_0^\infty v^{\frac12\nu_0 - 1} e^{-\frac12 v}\, \frac{(\tfrac12 Xv/\nu_0)^j}{j!}\, e^{-\frac12 Xv/\nu_0}\, dv$$

$$= \frac{1}{2^{\frac12\nu_0}\,\Gamma(\tfrac12\nu_0)}\, \frac{(X/2\nu_0)^j}{j!} \int_0^\infty v^{\frac12\nu_0 + j - 1} e^{-\frac12 v(1 + X/\nu_0)}\, dv$$

$$= \frac{\Gamma(\tfrac12\nu_0 + j)}{j!\,\Gamma(\tfrac12\nu_0)} \left(\frac{X}{\nu_0}\right)^j \left(1 + \frac{X}{\nu_0}\right)^{-(\frac12\nu_0 + j)},$$

so that $\xi'$ is a negative binomial variable with integral probabilities given by terms in the expansion of $[(1 + X/\nu_0) - X/\nu_0]^{-\frac12\nu_0}$.

Equation (12) implies therefore that the distribution of non-central F can be expressed in terms of the distribution of the difference between a negative binomial variable ($\xi'$) and an independent Poisson variable ($l$). 

The expected value and variance of $\xi'$ are

$$\tfrac12 X \quad\text{and}\quad \tfrac12\nu_0 \cdot \frac{X}{\nu_0}\left(1 + \frac{X}{\nu_0}\right) = \tfrac12 X\left(1 + \frac{X}{\nu_0}\right)$$

respectively. Hence the analogue of approximation (6) is

$$\Pr\{F' \le X/\nu\} \approx \frac{1}{\sqrt{(2\pi)}} \int_{-\infty}^{(X - \lambda - \nu + 1)/\sqrt{[2(X + \lambda + X^2/\nu_0)]}} e^{-\frac12 x^2}\, dx. \qquad (14)$$

This differs from (6) only in the term $X^2/\nu_0$. Table 3 gives approximate values calculated using (14) compared with exact values and values of an approximation, using the central F distribution, obtained by Patnaik, taken from Patnaik’s table 7. Table 3 also gives some further approximations analogous to those obtained for non-central χ² by the modification (6)’ of (6). The corrective term to be added in the present case is

$$\frac{1}{3\sqrt{2}}\, \frac{X(1 + 2X/\nu_0)(1 + X/\nu_0) - \lambda}{(X + \lambda + X^2/\nu_0)^{3/2}} \left[\frac{(X - \lambda - \nu + 1)^2}{2(X + \lambda + X^2/\nu_0)} - 1\right] \frac{1}{\sqrt{(2\pi)}} \exp\left[-\frac{(X - \lambda - \nu + 1)^2}{4(X + \lambda + X^2/\nu_0)}\right]. \qquad (14)'$$

Approximation (14), for the integral of the non-central F distribution, is not generally as accurate as Patnaik’s central F approximation. This is in contrast to the situation for non-central χ², and may correspond to the lower accuracy of the Normal approximation to the negative binomial distribution of $\xi'$ compared with the Poisson distribution of $\xi$. The fall in accuracy of (14) as X increases is to be expected since the skewness of the negative binomial is greater than that of the Poisson distribution when X is large.

The exact distribution of $\xi' - l$ can be expressed in terms of the confluent hypergeometric function

$$M(a, c, x) = \frac{\Gamma(c)}{\Gamma(a)} \sum_{j=0}^{\infty} \frac{\Gamma(a + j)}{\Gamma(c + j)}\, \frac{x^j}{j!}. \qquad (15)$$

We have for $k \ge 0$

$$\Pr\{\xi' - l = k\} = e^{-\frac12\lambda} \sum_{j=0}^{\infty} \frac{(\tfrac12\lambda)^j}{j!}\, \frac{\Gamma(\tfrac12\nu_0 + k + j)}{\Gamma(\tfrac12\nu_0)\,\Gamma(k + j + 1)} \left(\frac{X}{\nu_0}\right)^{k+j} \left(1 + \frac{X}{\nu_0}\right)^{-\frac12\nu_0 - k - j}$$

$$= \frac{e^{-\frac12\lambda}\,\Gamma(\tfrac12\nu_0 + k)}{\Gamma(\tfrac12\nu_0)\, k!} \left(\frac{X}{\nu_0}\right)^{k} \left(1 + \frac{X}{\nu_0}\right)^{-\frac12\nu_0 - k} M\left(\tfrac12\nu_0 + k,\ k + 1,\ \tfrac12 X\lambda/(X + \nu_0)\right). \qquad (16)$$

Direct evaluation of (16) would be facilitated by appropriate tables of the confluent hypergeometric function. The tables of Rushton & Lang (1954) might be useful, though some recurrence relations would be needed to cover the larger values of $k$. 
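Relation (13) and approximation (14) can both be evaluated directly. The sketch below is an illustrative addition with hypothetical function names: the exact value sums Poisson probabilities against negative binomial upper tails and therefore assumes ν even and X > 0, with a fixed truncation that suits only moderate λ.

```python
import math


def normal_cdf(z):
    """Standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_pmf(k, mean):
    """Pr{l = k} for a Poisson variable; a zero mean puts all mass at k = 0."""
    if mean == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def negbin_pmf(k, X, nu0):
    """Pr{xi' = k}: coefficient of t^k in [(1 + X/nu0) - (X/nu0) t]^(-nu0/2)."""
    r = nu0 / 2.0
    p = X / nu0
    return math.exp(math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
                    + k * math.log(p) - (r + k) * math.log1p(p))


def noncentral_f_cdf_exact(X, nu, nu0, lam, terms=400):
    """Relation (13) for even nu: Pr{F' <= X/nu} = Pr{xi' - l >= nu/2}."""
    half_nu = nu // 2
    pmf = [negbin_pmf(k, X, nu0) for k in range(half_nu + terms)]
    below = sum(pmf[:half_nu])          # Pr{xi' < nu/2 + j}, starting at j = 0
    total = 0.0
    for j in range(terms):
        total += poisson_pmf(j, lam / 2.0) * (1.0 - below)
        below += pmf[half_nu + j]
    return total


def approx14(X, nu, nu0, lam):
    """Normal approximation (14) to Pr{F' <= X/nu}."""
    z = (X - lam - nu + 1.0) / math.sqrt(2.0 * (X + lam + X * X / nu0))
    return normal_cdf(z)


if __name__ == "__main__":
    X = 8 * 3.072                       # nu times the tabulated value of X/nu
    print(noncentral_f_cdf_exact(X, 8, 10, 9.0), approx14(X, 8, 10, 9.0))
```

With λ = 0 the exact function reduces to a central F integral, which gives an independent check against ordinary F tables.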

For the inverse problem of calculating significance limits for $F'$, we can use an equation similar to (7). It is interesting to note that the quadratic equation (in X) derived from (7) can be changed into the corresponding equation (in X = ν × (significance limit of $F'$)) simply by reducing the coefficient of $X^2$ by $2u_\alpha^2/\nu_0$. 
8. In this paper it has been shown that the relationship between central χ² and Poisson variables can be extended in two ways. The cumulative distribution function of a non-central χ² can be expressed in the form of the upper tail of the distribution of 

(a) (Poisson variable)−(Poisson variable), provided the number (ν) of degrees of freedom is even. 

Also the cumulative distribution function of a non-central F can be expressed as the upper tail of the distribution of 

(b) (Negative binomial variable)−(Poisson variable), again provided the number (ν) of degrees of freedom of the non-central χ² in the numerator is even. 


Table 3. Values of the non-central F integral, Pr{F′ ≤ X/ν} 


((i) Exact values; (ii) Patnaik’s central F approximation; 
(iii) approximation (14); (iv) approximation (14) modified by (14)’.) 

  ν    ν₀     λ     X/ν      (i)     (ii)    (iii)    (iv)
  3    10     4    3.708    0.745   0.752   0.755   0.736
              4    6.552    0.918   0.919   0.889   0.901
             16    3.708    0.206   0.203   0.219   0.213
             16    6.552    0.517   0.520   0.544   0.517

  3    20     4    3.098    0.700   0.706   0.711   0.694
              4    4.938    0.887   0.889   0.873   0.879
             16    3.098    0.126   0.119   0.129   0.130
             16    4.938    0.347   0.350   0.364   0.350

  5    10     6    3.326    0.731   0.731   0.746   0.726
              6    5.636    0.914   0.913   0.886   0.903
             24    3.326    0.158   0.157   0.165   0.165
             24    5.636    0.461   0.463   0.504   0.461

  5    20     6    2.711    0.664   0.665   0.680   0.660
              6    4.103    0.870   0.869   0.859   0.862
             24    2.711    0.069   0.064   0.068   0.072
             24    4.103    0.245   0.244   0.257   0.249

  8    10     9    3.072    0.714   0.715   0.743   0.718
              9    5.057    0.908   0.909   0.882   0.893
             36    3.072    0.119   0.117   0.118   0.125
             36    5.057    0.408   0.409   0.451   0.409

  8    30     9    2.226    0.578   0.581   0.596   0.576
              9    3.173    0.813   0.815   0.813   0.809
             36    2.226    0.017   0.014   0.015   0.017
             36    3.173    0.088   0.085   0.086   0.090








For completeness it is interesting to inquire whether the upper tails of the distributions of 

(c) (Poisson variable)—(Negative binomial variable), 

(d) (Negative binomial variable)-(Negative binomial variable) correspond to the 
cumulative distribution functions of any continuous variables of well-known types. 
The second variable in (a) or (b) can be changed into a negative binomial variable by replacing the fixed expected value, $\tfrac12\lambda$, by a random variable with the distribution of a multiple of (central) χ². In the special case when this χ² has the same number (ν) of degrees of freedom as the original non-central χ² this would correspond to a component of variance analysis of variance model (e.g. Johnson, 1948) as opposed to the ‘parametric’ model represented by a fixed λ. In this situation the non-central χ² becomes a central χ² with the same number (ν) of degrees of freedom. 

Hence, if ν is even, (c) is then associated with a central χ² distribution with ν degrees of freedom. This is, however, only a special case of (c) and, even in this case, the original formula (1) provides a more useful result. When $\tfrac12\lambda$ is a multiple of a central χ² with arbitrary degrees of freedom (ν′) there seems to be no simple continuous distribution associated with (c). But if ν′ < ν we might regard our modified non-central χ², $\chi'^2$, as arising from (2) in a mixed type of model, for which the $a_i$’s are random variables but are not linearly independent. Then (c) would provide (for even ν) the discontinuous counterpart. 

Similarly, (d) can be associated, in a special case, with a central F-distribution with an even number (ν) of degrees of freedom in the numerator. Again, this is only a special case of (d), and the relation

$$\Pr\{F' \le X/\nu\} = \Pr\{\xi' \ge \tfrac12\nu\},$$

obtained by putting λ = 0 in (b), would be preferable for most purposes. The remarks about possible interpretations of the general form of (c) in terms of a mixed form of model apply to (d) also. 

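9. Finally we repeat the procedure used in obtaining (12), but replacing the central $\chi^2_{\nu_0}$
by a non-central $\chi^2$, with $\nu_0$ degrees of freedom and non-centrality parameter $\lambda_0$, which we
will denote by $\chi'^2_{\nu_0}$.

Since the probability density function of $\chi'^2_{\nu_0}$ can be written

$$p(\chi'^2_{\nu_0}) = \sum_{j=0}^{\infty} e^{-\frac12\lambda_0}\,\frac{(\tfrac12\lambda_0)^j}{j!}\; p(\chi^2_{\nu_0+2j} = \chi'^2_{\nu_0}),$$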


it follows that the averaged $\xi$ is now a mixture of negative binomial variables, having the
probability generating functions

$$[(1 + X/\nu_0) - tX/\nu_0]^{-(\frac12\nu_0 + j)} \tag{17}$$

with probabilities $e^{-\frac12\lambda_0}(\tfrac12\lambda_0)^j/j!$ respectively ($j = 0, 1, 2, \ldots$). Hence if
$F'' = (\chi'^2_{\nu}/\nu)/(\chi'^2_{\nu_0}/\nu_0)$ represents an F-ratio based on two non-central $\chi^2$'s then, if $\nu$ is even,

$$\Pr\{F'' < X/\nu\} = \sum_{j=0}^{\infty} e^{-\frac12\lambda_0}\,\frac{(\tfrac12\lambda_0)^j}{j!}\,\Pr\{\xi_j - \eta \geq \tfrac12\nu\}, \tag{18}$$

where $\xi_j$ is a negative binomial variable independent of $\eta$, having probability generating
function (17). We note that $\xi_j$ has the same distribution as the random variable $\xi'$ which
appears in (13), with $\nu_0$ replaced by $\nu_0 + 2j$ and $X$ replaced by $X(\nu_0 + 2j)/\nu_0$. Since

$$E(\xi_j) = \tfrac12 X(1 + 2j/\nu_0) \quad\text{and}\quad \operatorname{var}(\xi_j) = \tfrac12 X(1 + X/\nu_0)(1 + 2j/\nu_0),$$

$\Pr\{\xi_j - \eta \geq \tfrac12\nu\}$ tends to 1 as $j$ increases without limit. Hence, for a sufficiently large
value of $j$, the remaining terms in the series on the right-hand side of (18) can be evaluated
as a Poisson sum. A possibly useful way of evaluating (18) approximately would be to
regard the right-hand side as a formula for the expected value of $\Pr\{\xi_j - \eta \geq \tfrac12\nu\}$ when $j$ has
a Poisson distribution with mean $\tfrac12\lambda_0$. The first term in an evaluation of this expected value
by the method of statistical differentials, for example, would be $\Pr\{\xi'' - \eta \geq \tfrac12\nu\}$, where
$\xi''$ is a Poisson random variable with mean $\tfrac12\lambda_0$; usually, of course, further terms would be
needed.


10. We had, in the last paragraph of § 9, a 'mixture' of negative binomial variables (see
Robbins & Pitman (1949) for formal use of the term 'mixture'). The concept of a 'mixture'
of variables has a number of other applications in the present context. We can regard the
non-central $\chi^2$ distribution as a mixture of central $\chi^2$ distributions, the negative binomial
distribution as a mixture (with continuous weighting) of Poisson distributions and the
non-central G-ratio distribution as a mixture of central G distributions. Here, central
$G_{\nu,\nu_0}$ is defined as the ratio of a central $\chi^2_{\nu}$ to an independent central $\chi^2_{\nu_0}$; for non-central G
the numerator becomes a non-central $\chi^2$, $\chi'^2_{\nu}$, say. A reasonable type of approximation to
investigate is that obtained by using a mixture containing only two of the component
distributions. This is a natural first extension of single component approximations of the
type used by Patnaik for non-central $\chi^2$ and non-central F.

We will find it convenient to use the following lemma (Jones, 1933). 

Lemma. If $c z_1^r + (1-c) z_2^r = k_r$ ($r = 1, 2, 3$) then $z_1$, $z_2$ must be roots of the equation

$$(k_2 - k_1^2) z^2 - (k_3 - k_1 k_2) z + k_1 k_3 - k_2^2 = 0.$$

This result follows from the equations

$$k_2 - k_1^2 = c(1-c)(z_1 - z_2)^2,$$
$$k_3 - k_1 k_2 = c(1-c)(z_1 + z_2)(z_1 - z_2)^2,$$
$$k_1 k_3 - k_2^2 = c(1-c)\, z_1 z_2 (z_1 - z_2)^2.$$
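(i) Non-central $\chi^2$
If $\chi'^2_{\nu}$ be a non-central $\chi^2$ with $\nu$ degrees of freedom and non-centrality parameter $\lambda$,

$$\mu'_1(\chi'^2_{\nu}) = \nu + \lambda, \qquad \mu'_2(\chi'^2_{\nu}) = 2(\nu + 2\lambda) + (\nu + \lambda)^2,$$
$$\mu'_3(\chi'^2_{\nu}) = 8(\nu + 3\lambda) + 6(\nu + 2\lambda)(\nu + \lambda) + (\nu + \lambda)^3.$$

If $\chi^2$ be a mixture of two central $\chi^2$'s with $\nu_1$, $\nu_2$ degrees of freedom respectively, in the ratio
$c : (1-c)$, then, equating the first three moments of $\chi^2$ and $\chi'^2_{\nu}$ we find

$$c\nu_1 + (1-c)\nu_2 = \nu + \lambda,$$
$$c\nu_1^2 + (1-c)\nu_2^2 = (\nu + \lambda)^2 + 2\lambda,$$
$$c\nu_1^3 + (1-c)\nu_2^3 = (\nu + \lambda)^3 + 6\lambda(\nu + \lambda) + 4\lambda.$$

Using the Lemma, $\nu_1$, $\nu_2$ are roots of

$$z^2 - 2(\nu + \lambda + 1)z + (\nu + \lambda)^2 + 2\nu = 0,$$

so
$$\nu_1 = \nu + \lambda + 1 + \sqrt{2\lambda + 1},$$
$$\nu_2 = \nu + \lambda + 1 - \sqrt{2\lambda + 1},$$
and
$$c = \tfrac12\left(1 - (2\lambda + 1)^{-1/2}\right).$$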
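The two-component fit of the non-central $\chi^2$ can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are illustrative, and it assumes $\lambda > 0$ so that the quadratic of the Lemma is non-degenerate. It solves the Lemma's quadratic from the three power sums and compares the first three raw moments of the mixture with those of the non-central $\chi^2$.

```python
import math

def two_point_mixture(k1, k2, k3):
    """Solve c*z1**r + (1-c)*z2**r = k_r (r = 1, 2, 3) using the Lemma's quadratic."""
    a = k2 - k1 ** 2              # coefficient of z^2
    b = k3 - k1 * k2              # minus the coefficient of z
    c0 = k1 * k3 - k2 ** 2        # constant term
    disc = math.sqrt(b * b - 4.0 * a * c0)
    z1 = (b + disc) / (2.0 * a)
    z2 = (b - disc) / (2.0 * a)
    weight = (k1 - z2) / (z1 - z2)  # from c*z1 + (1-c)*z2 = k1
    return z1, z2, weight

def noncentral_chi2_mixture(nu, lam):
    """Degrees of freedom nu1, nu2 and weight c for the two-component mixture (lam > 0)."""
    s = nu + lam
    return two_point_mixture(s, s ** 2 + 2 * lam, s ** 3 + 6 * lam * s + 4 * lam)

def mixture_raw_moments(nu1, nu2, c):
    """First three raw moments of the mixture of central chi-squares."""
    def raw(v):
        return v, v * (v + 2), v * (v + 2) * (v + 4)
    return tuple(c * m1 + (1 - c) * m2 for m1, m2 in zip(raw(nu1), raw(nu2)))

def noncentral_raw_moments(nu, lam):
    """First three raw moments of non-central chi-square, as given in the text."""
    s = nu + lam
    return (s,
            2 * (nu + 2 * lam) + s ** 2,
            8 * (nu + 3 * lam) + 6 * (nu + 2 * lam) * s + s ** 3)
```

For example, `noncentral_chi2_mixture(4, 3)` should reproduce the closed forms $\nu + \lambda + 1 \pm \sqrt{2\lambda + 1}$ and $c = \tfrac12(1 - (2\lambda+1)^{-1/2})$.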


We can approximate similarly to the distribution of $a_1\chi^2_{w_1} + a_2\chi^2_{w_2}$. We obtain equations

$$c\nu_1 + (1-c)\nu_2 = W,$$
$$c\nu_1^2 + (1-c)\nu_2^2 = W^2 + 2Y,$$
$$c\nu_1^3 + (1-c)\nu_2^3 = W^3 + 6WY + 4Z,$$

where
$$W = a_1 w_1 + a_2 w_2, \qquad Y = a_1(a_1 - 1)w_1 + a_2(a_2 - 1)w_2,$$
$$Z = a_1(a_1 - 1)(2a_1 - 1)w_1 + a_2(a_2 - 1)(2a_2 - 1)w_2.$$

We find
$$\nu_1 = (Z + WY + \sqrt{Z^2 + 2Y^3})/Y, \qquad \nu_2 = (Z + WY - \sqrt{Z^2 + 2Y^3})/Y,$$
$$c = \tfrac12\{1 - Z(Z^2 + 2Y^3)^{-1/2}\}.$$

It will usually be convenient to arrange that the variable is 'standardized' by making
$a_1 + a_2 = 1$.

This result can be used in approximating to the distribution of modified t,

$$t' = \frac{u\sqrt{a_1 w_1 + a_2 w_2}}{\sqrt{a_1\chi^2_{w_1} + a_2\chi^2_{w_2}}},$$

where $u$ is a unit Normal variable independent of the $\chi^2$'s. Modified t is encountered in
connexion with some criteria for testing differences between means of Normal populations
with unknown variances (Welch (1938)). Applying the approximation of this section we have

$$\Pr\{t' < t_0\} \approx c\,\Pr\{t_{\nu_1} < t_0\sqrt{\nu_1/(a_1 w_1 + a_2 w_2)}\} + (1-c)\,\Pr\{t_{\nu_2} < t_0\sqrt{\nu_2/(a_1 w_1 + a_2 w_2)}\}. \tag{19}$$

Non-central $t'$ ($u$ replaced by $u + \delta$) can similarly be expressed in terms of non-central t
integrals.





(ii) Negative binomial
Let $y$ be a negative binomial variable with probability generating function

$$(q - tp)^{-n}.$$

Then the mixture, in the ratio $c : (1-c)$, of two Poisson distributions with means $m_1$, $m_2$
will have the same first three moments as $y$ if

$$c m_1 + (1-c) m_2 = np,$$
$$c m_1^2 + (1-c) m_2^2 = n(n+1)p^2,$$
$$c m_1^3 + (1-c) m_2^3 = n(n+1)(n+2)p^3.$$

$m_1$ and $m_2$ are the roots of $z^2 - 2(n+1)pz + n(n+1)p^2 = 0$, so

$$m_1 = p\sqrt{n+1}\,(\sqrt{n+1} + 1), \qquad m_2 = p\sqrt{n+1}\,(\sqrt{n+1} - 1),$$

and $c = \tfrac12\left(1 - (n+1)^{-1/2}\right)$.

It is interesting to note that with these values of $m_1$, $m_2$ and $c$, we have

$$c m_1^4 + (1-c) m_2^4 = n(n+1)^2(n+4)p^4$$

as compared with the value $n(n+1)(n+2)(n+3)p^4$ which would give exact agreement
also with the fourth moment of $y$.
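The Poisson-pair fit to the negative binomial can be verified directly from the closed forms. The sketch below is illustrative only (the function names are not from the paper): it computes $m_1$, $m_2$ and $c$ and compares the power sums $c m_1^r + (1-c) m_2^r$ with the factorial moments $n(n+1)\cdots(n+r-1)p^r$ of the negative binomial, including the fourth-order discrepancy noted in the text.

```python
import math

def poisson_pair(n, p):
    """Means m1, m2 and weight c of the two Poisson components fitted to (q - t*p)**(-n)."""
    root = math.sqrt(n + 1.0)
    m1 = p * root * (root + 1.0)
    m2 = p * root * (root - 1.0)
    c = 0.5 * (1.0 - 1.0 / root)
    return m1, m2, c

def mixture_power_sum(n, p, r):
    """c*m1**r + (1-c)*m2**r, the r-th factorial moment of the Poisson mixture."""
    m1, m2, c = poisson_pair(n, p)
    return c * m1 ** r + (1.0 - c) * m2 ** r

def negbin_factorial_moment(n, p, r):
    """n(n+1)...(n+r-1) p**r, the r-th factorial moment of the negative binomial."""
    out = p ** r
    for k in range(r):
        out *= n + k
    return out
```

With $n = 3$, $p = 0.5$ the first three power sums agree with the negative binomial, while the fourth equals $n(n+1)^2(n+4)p^4 = 21$ against the exact $22.5$.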
(iii) Non-central F

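Two alternative approximations to the non-central F probability integral will be put
forward here.

The first is based on the direct approximation to the non-central G distribution using
(i) of this section to approximate the non-central $\chi^2$ in the numerator. This leads immediately
to the approximation to the distribution of non-central $G'_{\nu,\nu_0} = \chi'^2_{\nu}/\chi^2_{\nu_0}$ by the mixture in
the ratio

$$(1 - (2\lambda + 1)^{-1/2}) : (1 + (2\lambda + 1)^{-1/2})$$

of the central G distributions of $G_{\nu_1,\nu_0}$, $G_{\nu_2,\nu_0}$, where
$\nu_1 = \nu + \lambda + 1 + \sqrt{2\lambda + 1}$, $\nu_2 = \nu + \lambda + 1 - \sqrt{2\lambda + 1}$.
Equivalently, $F'_{\nu,\nu_0} = (\chi'^2_{\nu}/\nu)/(\chi^2_{\nu_0}/\nu_0)$ is approximated by a mixture of
$\nu_1 F_{\nu_1,\nu_0}/\nu$ and $\nu_2 F_{\nu_2,\nu_0}/\nu$, so that

$$\Pr\{F'_{\nu,\nu_0} < X/\nu\} \approx \tfrac12[1 - (2\lambda + 1)^{-1/2}]\,\Pr\{F_{\nu_1,\nu_0} < X/\nu_1\}
 + \tfrac12[1 + (2\lambda + 1)^{-1/2}]\,\Pr\{F_{\nu_2,\nu_0} < X/\nu_2\}. \tag{20}$$

The second approximation is based on (13). Using (ii) of this section we have

$$\Pr\{\xi' - \eta \geq \tfrac12\nu\} \approx \tfrac12[1 - (\tfrac12\nu_0 + 1)^{-1/2}]\,\Pr\{\xi_1 - \eta \geq \tfrac12\nu\}
 + \tfrac12[1 + (\tfrac12\nu_0 + 1)^{-1/2}]\,\Pr\{\xi_2 - \eta \geq \tfrac12\nu\}, \tag{21}$$

where $\xi_1$, $\xi_2$ are Poisson variables, independent of $\eta$, with expected values

$$\tfrac12 X_1 = X\sqrt{\tfrac12\nu_0 + 1}\,(\sqrt{\tfrac12\nu_0 + 1} + 1)/\nu_0, \qquad
  \tfrac12 X_2 = X\sqrt{\tfrac12\nu_0 + 1}\,(\sqrt{\tfrac12\nu_0 + 1} - 1)/\nu_0$$

respectively. Proceeding further we use (4) to express (21) in terms of non-central $\chi^2$
integrals. We then use (i) of this section to obtain an approximation to the non-central F
integral in terms of four central $\chi^2$ integrals. This is

$$\Pr\{F'_{\nu,\nu_0} < X/\nu\} \approx \tfrac12[1 - (\tfrac12\nu_0 + 1)^{-1/2}]\,[\tfrac12(1 - (2\lambda + 1)^{-1/2})\Pr\{\chi^2_{\nu_1} < X_1\}
 + \tfrac12(1 + (2\lambda + 1)^{-1/2})\Pr\{\chi^2_{\nu_2} < X_1\}]$$
$$\qquad + \tfrac12[1 + (\tfrac12\nu_0 + 1)^{-1/2}]\,[\tfrac12(1 - (2\lambda + 1)^{-1/2})\Pr\{\chi^2_{\nu_1} < X_2\}
 + \tfrac12(1 + (2\lambda + 1)^{-1/2})\Pr\{\chi^2_{\nu_2} < X_2\}],$$

where $X_1$, $X_2$ have the values defined above, and

$$\nu_1 = \nu + \lambda + 1 + \sqrt{2\lambda + 1}, \qquad \nu_2 = \nu + \lambda + 1 - \sqrt{2\lambda + 1}.$$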


REFERENCES 

Cornish, E. A. & Fisher, R. A. (1937). Moments and cumulants in the specification of distributions.
Rev. Inst. Int. Statist. 5, 307-22.

Cox, D. R. (1955). Some statistical methods connected with series of events. J. R. Statist. Soc. B,
17, 129-64.

Irwin, J. O. (1937). The frequency distribution of the difference between two independent variables
following the same Poisson distribution. J. R. Statist. Soc. 100, 415-16.

Johnson, N. L. (1948). Alternative systems in the analysis of variance. Biometrika, 35, 80-7.

Jones, H. G. (1933). A note on the n-ages method. J. Inst. Actu. 64, 318-24.

Patnaik, P. B. (1949). The non-central $\chi^2$ and F distributions and their applications. Biometrika,
36, 202-32.

Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University
Press.

Robbins, H. E. & Pitman, E. J. G. (1949). Application of the method of mixtures to quadratic forms
in Normal variates. Ann. Math. Statist. 20, 552-60.

Rushton, S. & Lang, E. D. (1954). Tables of the confluent hypergeometric function. Sankhyā, 13,
377-411.

Skellam, J. G. (1946). The frequency distribution of the difference between two Poisson variates
belonging to different populations. J. R. Statist. Soc. 109, 296.

Welch, B. L. (1938). The significance of the difference between two means when the population
variances are unequal. Biometrika, 29, 350-62.




Note on an approximation to the distribution of non-central $\chi^2$
By E. S. PEARSON


Dr Johnson's comparison on pp. 352-63 above of approximations to the probability integral of the
non-central $\chi^2$ distribution suggests that it might be appropriate to put on record here another
approximation which Mrs Maxine Merrington and I tried out a little while ago. If, following Johnson, the
non-central $\chi^2$ variable is denoted by $\chi'^2$, then we have

$$E(\chi'^2) = \nu + \lambda, \qquad \sigma(\chi'^2) = \sqrt{2(\nu + 2\lambda)},$$
$$\beta_1(\chi'^2) = \frac{8(\nu + 3\lambda)^2}{(\nu + 2\lambda)^3}, \qquad \beta_2(\chi'^2) = 3 + \frac{12(\nu + 4\lambda)}{(\nu + 2\lambda)^2}. \tag{1}$$

It follows that the $\beta_1$, $\beta_2$ point of the distribution must fall within a fairly narrow sector of the $\beta_1$, $\beta_2$
field lying between the straight lines

$$\beta_2 - \tfrac32\beta_1 - 3 = 0 \quad (\text{when } \lambda/\nu \to 0),$$
$$\beta_2 - \tfrac43\beta_1 - 3 = 0 \quad (\text{when } \nu/\lambda \to 0),$$

the former being the $\chi^2$ or Type III line. This suggests that, in shape, the distribution of $\chi'^2$ will never
differ greatly from that of a $\chi^2$ distribution having appropriately chosen parameters. Patnaik gave the
approximating distribution the correct start at $\chi'^2 = 0$ and the correct mean and standard deviation.
It occurred to us that except in cases where the curve rises abruptly at the lower end, i.e. when both
$\nu$ and $\lambda$ are small, it might be better to approximate by using a $\chi^2$ distribution with the correct mean,
standard deviation and $\beta_1$ coefficient, but not necessarily exactly the right start.
This latter result is achieved by taking

$$\chi'^2 = \frac{\chi^2_{\nu'} - \nu'}{\sqrt{2\nu'}}\,\sigma(\chi'^2) + E(\chi'^2)
 = \frac{\nu + 3\lambda}{\nu + 2\lambda}\,\chi^2_{\nu'} - \frac{\lambda^2}{\nu + 3\lambda}, \tag{2}$$

where $\chi^2_{\nu'}$ is distributed as a central $\chi^2$ having fractional degrees of freedom given by

$$\nu' = \frac{8}{\beta_1(\chi'^2)} = \frac{(\nu + 2\lambda)^3}{(\nu + 3\lambda)^2}. \tag{3}$$

Using equations (2) and (3) and interpolating for $\nu'$ in the table of percentage points of $\chi^2$ (Biometrika
Tables for Statisticians, 1, table 8) we obtained upper and lower 5% points for non-central $\chi^2$ with the
errors as shown in columns (v) of Johnson's table 1. Again, for the $X$, $\nu$ and $\lambda$ values given in his table 26,
we obtained the values of the probability integral shown in column (v) of that table; in this case, the
Tables of the Incomplete Gamma Function were used. Taken as a whole, it is clear that for the particular
values of the parameters selected, the three-moment $\chi^2$ approximation is better than the other
approximations. The computation involved is straightforward, apart from the necessity of interpolating for
fractional degrees of freedom, $\nu'$.
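The constants of the three-moment approximation are simple to compute. As a minimal sketch (the function names are illustrative, not taken from the note), the code below returns $\nu'$ together with the multiplier $(\nu+3\lambda)/(\nu+2\lambda)$ and the shift $-\lambda^2/(\nu+3\lambda)$, and checks that the first three cumulants of the shifted and scaled central $\chi^2$ equal $\nu+\lambda$, $2(\nu+2\lambda)$ and $8(\nu+3\lambda)$.

```python
def three_moment_constants(nu, lam):
    """Return (nu_prime, scale, shift) so that chi'^2 is approximated by
    scale * chi2(nu_prime) + shift."""
    nu_prime = (nu + 2.0 * lam) ** 3 / (nu + 3.0 * lam) ** 2
    scale = (nu + 3.0 * lam) / (nu + 2.0 * lam)
    shift = -lam ** 2 / (nu + 3.0 * lam)
    return nu_prime, scale, shift

def approx_cumulants(nu, lam):
    """First three cumulants of scale * chi2(nu_prime) + shift."""
    nu_prime, scale, shift = three_moment_constants(nu, lam)
    k1 = shift + scale * nu_prime          # mean
    k2 = 2.0 * scale ** 2 * nu_prime       # variance
    k3 = 8.0 * scale ** 3 * nu_prime       # third cumulant
    return k1, k2, k3
```

For $\nu = \lambda = 4$ this gives $\nu' = 6.75$, multiplier $4/3$ and shift $-1$; a probability integral would then be read from a central $\chi^2$ table at $(X - \text{shift})/\text{scale}$ with $\nu'$ degrees of freedom.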

It should, however, be emphasized that a search for accurate approximations to non-central statistics
was not the main purpose of Dr Johnson's investigation.

THE $\chi^2$ TEST FOR SMALL EXPECTATIONS IN CONTINGENCY
TABLES, WITH SPECIAL REFERENCE TO
ACCIDENTS AND ABSENTEEISM


By C. A. G. NASS 
Netherlands Institute of Preventive Medicine, Leiden 


1. INTRODUCTION 


In the study of absenteeism we are often presented with a large number of small samples of 
sickness absences or accidents, specified for two or more subperiods, each sample being 
contributed by one worker, observed during the total period. We desire to know whether 
the absences, say, of individual workers can be regarded as random samples of such a 
population, given by the marginal totals, regardless of whether or not the marginal totals 
agree with the lengths of the subperiods or some other assumption about the distribution 
of absences in time. 
Thus we are presented with an $(m \times n)$-design, with $(m-1)(n-1)$ degrees of freedom:

$$\begin{array}{ccccc|c}
x_{11} & \cdots & x_{1j} & \cdots & x_{1m} & z_1 \\
\vdots &        & \vdots &        & \vdots & \vdots \\
x_{i1} & \cdots & x_{ij} & \cdots & x_{im} & z_i \\
\vdots &        & \vdots &        & \vdots & \vdots \\
x_{n1} & \cdots & x_{nj} & \cdots & x_{nm} & z_n \\ \hline
y_1    & \cdots & y_j    & \cdots & y_m    & N
\end{array} \tag{1}$$

The joint distribution with which we are concerned is one that could arise, among other
ways, if from a finite population, subdivided into $m$ classes with $y_1, \ldots, y_m$ elements, samples
of $z_1, \ldots, z_{n-1}$ elements are randomly drawn without replacement, the remainder being
considered as the $n$th sample with $z_n$ elements. In our particular problem the rows of (1)
represent workers, the columns subperiods and the letters the number of absences or
accidents of one worker in one subperiod.

Our test of homogeneity or independence within the contingency table will be based on
the statistic

$$G = N\sum_{i}\sum_{j}\frac{x_{ij}^2}{y_j z_i} - N. \tag{2}$$

The standard test is to set
$$\chi^2 = G, \qquad \nu = (m-1)(n-1). \tag{3}$$

It is well known that if none of the expectations of $x_{ij}$ are too small, then under the null
hypothesis of independence, $G$ will be distributed approximately in a chi-squared
distribution with a parameter $\nu$ equal to the number of degrees of freedom. When, however,
the expectations are small the distribution of $G$ will differ from that of $\chi^2$, partly because the
former becomes noticeably discontinuous and partly because the moments of $G$, beyond
the first, may be rather different from those of $\chi^2$.

With absenteeism it often happens that a number of expectations of $x_{ij}$ are smaller
than unity, because a number of workers have only one absence in the total period of
observation. Haldane (1937) met with the same difficulty in the field of genetics. In his
case the rows of (1) represent litters, the columns coat-colours and the letters numbers of
mice. He suggested using the normal distribution as an approximation to the distribution of
$G$, i.e.
$$u = (G - E(G))/\sqrt{\operatorname{var} G}, \tag{4}$$

where $u$ is assumed to be approximately normally distributed about zero with unit standard
deviation. However, this approximation must be poor if the coefficient of variation of $G$
is not small, which occurs in the field of absenteeism as it does in genetics (Grüneberg &
Haldane, 1937).

Vessereau (1958), in a paper restricted to the one-way classification (a special case of
design (1)), considered the approximation of the distribution of $cG$ by a chi-squared
distribution with parameter $\nu$, where $c$ and $\nu$ are determined such that the first two moments
of the former distribution are equal to those of the latter, i.e.

$$\chi^2 = cG, \qquad c = 2E(G)/\operatorname{var} G, \qquad \nu = cE(G). \tag{5}$$

He also considered the approximation of the distribution of $a + G$ by a chi-squared
distribution, with suitably chosen $a$ and $\nu$. He found that it was markedly worse than in the
case of (5), judged by the third and fourth moments.

It might be conceived intuitively that the approximation (5) must be the better one. If
expectations are decreasing and the variance of $G$ has ceased to be nearly twice its
expectation, the general shape of the distribution of $G$ might continue for a while to resemble that
of some chi-squared distribution and it might continue to start at only a little above zero,
because the values of $N(y_j z_i)^{-1}$ are not integral. Probably a still better approximation could
be obtained by setting $\chi^2 = a + cG$, ensuring correct values for the first three moments.
But the third moment of $G$ for the general design (1) is at present unknown and even if
determined, would probably be too unwieldy for practical use.

It is true that Vessereau considered the additional computations involved in determining 
the c and v of (5) to be prohibitive. However, the greater labour involved is in calculating G, 
which must be performed in any case, rather than in calculating its variance, and we have 
suggested rules for simplifying the computation. Vessereau advised the use of the classical 
test (3), even in the presence of small expectations, suggesting that the 2% significance 
levels of the standard tables should be taken as 5 % levels to allow for the greater variance, 
which was present in most of his purely numerical examples. Observing this advice, one 
could be fairly sure that the real probability of errors of the first kind would not exceed the 
nominal probability, but one would have to accept that the errors of the second kind would 
be grossly inflated. If it were true that small expectations tend to increase the variance 
of G, the inflation would only be moderate, but such a tendency does not generally exist. In 
most of the examples given in this paper, and taken from the practice of absenteeism, the 
variance of $G$ is less than twice its expectation, and the same applies to the genetical examples
of Grüneberg & Haldane. Besides, the errors of the second kind are not always on the
‘safe’ side. Which of the two kinds appears as the most dangerous depends on the particular 
problem considered. For these reasons we do not support Vessereau’s advice. 

In the present paper we shall consider methods of determining the significance of $G$, when
expectations are small, based on Vessereau's approximation (5), although it was not
favoured by its author. In so far as the resulting value of $\nu$ is beyond the range of existing
tables of the chi-squared distribution, we shall use R. A. Fisher's approximation, namely,
that
$$u = \sqrt{2\chi^2} - \sqrt{2\nu - 1} = \sqrt{2cG} - \sqrt{2\nu - 1} \tag{6}$$

is normally distributed about zero with unit standard deviation. Comparisons between the
true and approximate distribution of $G$, although limited to a few easily calculated cases,
do suggest that the approximation is likely to be a good one in a very wide range of situations
likely to be met in the analysis of accident and absenteeism data.


2. THE AVAILABLE TOOLS 


The general design (1), and some of its special cases, constitute a handy set of tools for the 
study of accidents and absenteeism. For all of these the parameters c and v are obtained in 
the same way (equation (5)), but the appropriate expressions for G, H(G) and var G, to be 
given in this section, are different. For each of the tools we will give an appreciation of its 
special merits for certain jobs. 


(i) Case of m x n-fold table with (m —1)(n—1) degrees of freedom 
This is the general design (1). Equation (2) gives the expression for $G$. That for $E(G)$ is

$$E(G) = \frac{(m-1)(n-1)N}{N-1}. \tag{7}$$

After publishing expressions for the moments of $G$ for several special cases, Haldane (1939)
found an expression for the variance of $G$ for the general design. Dawson (1954) put
Haldane's expression into the following shape, which is more suitable for computation: if

$$\nu_1 = \frac{(N-n)(n-1)}{N-1}, \qquad \nu_2 = \frac{(N-m)(m-1)}{N-1},$$
$$\sigma_1 = \frac{N\sum(z_i^{-1}) - n^2}{N-2}, \qquad \sigma_2 = \frac{N\sum(y_j^{-1}) - m^2}{N-2}, \tag{8}$$

then
$$\operatorname{var} G = \frac{2N}{N-3}(\nu_1 - \sigma_1)(\nu_2 - \sigma_2) + \frac{N^2}{N-1}\,\sigma_1\sigma_2.$$
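The moments of $G$ and the constants $c$, $\nu$ of (5) depend only on the marginal totals, so they are easy to compute mechanically. The sketch below is illustrative rather than the author's procedure: the function names and the intermediate names `nu1`, `nu2`, `s1`, `s2` are chosen here, and the variance is coded as $\frac{2N}{N-3}(\nu_1-\sigma_1)(\nu_2-\sigma_2) + \frac{N^2}{N-1}\sigma_1\sigma_2$ with $\sigma_1 = (N\sum z_i^{-1} - n^2)/(N-2)$ and $\sigma_2 = (N\sum y_j^{-1} - m^2)/(N-2)$. The numerical check uses the first design of Table 2 in § 4 (ten rows with total 1, fifteen with total 2, column totals 20 and 20).

```python
def g_moments(row_totals, col_totals):
    """E(G) and var G for the general design, from the marginal totals only."""
    big_n = sum(row_totals)            # N, total number of observations
    n = len(row_totals)                # number of rows
    m = len(col_totals)                # number of columns
    mean = (m - 1) * (n - 1) * big_n / (big_n - 1)
    nu1 = (big_n - n) * (n - 1) / (big_n - 1)
    nu2 = (big_n - m) * (m - 1) / (big_n - 1)
    s1 = (big_n * sum(1.0 / z for z in row_totals) - n * n) / (big_n - 2)
    s2 = (big_n * sum(1.0 / y for y in col_totals) - m * m) / (big_n - 2)
    var = (2.0 * big_n / (big_n - 3) * (nu1 - s1) * (nu2 - s2)
           + big_n ** 2 / (big_n - 1) * s1 * s2)
    return mean, var

def scaled_chi2_constants(mean, var):
    """c and nu of approximation (5): chi^2 = c*G with nu = c*E(G)."""
    c = 2.0 * mean / var
    return c, c * mean
```

For that design the functions return values agreeing with the tabulated 24.615, 15.289, 3.220 and 79.26 to the precision shown there.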





The merit of this tool is that it tests independence regardless of distribution in time and of
distribution over workers. In this context 'independence' means that the set $(p_1, \ldots, p_m)$
of probabilities of subperiods, for an arbitrary worker with $z$ absences in the total period,
is independent of $z$ and of the worker. 'Distribution in time' means the distribution of
absences over subperiods, regardless of the population of workers. Accordingly, this concept
presupposes the existence of independence and it should be considered only after the data
have successfully passed the test for independence, or if there are other strong reasons to
take independence for granted. Under this presupposition the distribution in time is
multinomial and assumptions about the values of $p_1, \ldots, p_m$ can be tested with the classical
$\chi^2$-test, since expectations will usually be large enough.

The distribution of absences over workers and its inverse, the distribution of workers
over numbers of absences, lose practical importance if they are not constant over all
periods with $N$ absences in total. Analysis of such distributions should also be preceded by
the test for independence, if the data are specified for subperiods.

(ii) Case of m x n-fold table with m(n-1) or (m-1)n degrees of freedom
These are limiting cases of the preceding one. In fact, if an $(m+1)$th column is added to
the design (1), with a column total $Y$ tending to infinity and cell-frequencies $X_i$ such that
the quotients $X_i/Y$ tend to fixed numbers $p_i$, then the contribution of column $(X_i)$ to $G$
vanishes with probability one and (1) tends to a design with random row totals with, in
the limit:

$$G = \sum x_{ij}^2\, p_i^{-1} y_j^{-1} - N,$$
$$E(G) = m(n-1), \qquad \operatorname{var} G = 2m(n-1) + \left(\sum p_i^{-1} + 2 - n^2 - 2n\right)\sum y_j^{-1}. \tag{9}$$

The equations for the case with $(m-1)n$ degrees of freedom are symmetrical with (9).
These results were found by Haldane (1937). In these cases independence is tested
simultaneously with an assumed distribution of absences over either subperiods or workers. It
is difficult to imagine a situation in the study of absenteeism where this model would be
appropriate.
(iii) Case of one-way table
If we set $m = 1$ in case (ii) with $m(n-1)$ degrees of freedom, the design reduces to a
discrete one-way distribution with $n$ cells. Dropping the subscript $j$, the appropriate equations
are

$$G = \sum\left(x_i^2/(Np_i)\right) - N, \qquad E(G) = n - 1,$$
$$\operatorname{var} G = 2(n-1) - (n^2 + 2n - 2)/N + \sum (Np_i)^{-1}. \tag{10}$$

These results were found by K. Pearson (1932).

Usually one or more parameters must be fitted to the sample distribution to find the $p_i$.
To provide for this fact we suggest that $E(G)$ and $\operatorname{var} G$, found with (10), be multiplied by
$(n-k-1)/(n-1)$, if $k$ parameters are fitted. This is the approximate effect of efficient
parameter fitting when expectations are large. That it would apply also when expectations
are small is only an unproved guess, but it seems better than no correction at all, which
would certainly produce an inflation of the errors of the second kind.

It is often desirable to reserve the notation $x_i$ or $x$ for the class values of the variate of
the distribution tested. In that case the class frequencies might be denoted by $n_i$ or $n_x$.

If the range of the distribution is infinite, it must be curtailed somewhere. Since departure 
from expectation often depends heavily on one or two very improbable sample values, a 
curtailment which excludes all expectations smaller than a few units is apt to produce a
substantial loss of power. The merit of case (iii) is that it allows us to postpone the
curtailment until expectations are well under one unit. Some results of § 4 below suggest that one
could go as far as to include expectations of 0.05, if the number of cells is large and if there
are also considerable numbers of cells with somewhat higher expectations.


(iv) Case of equal probabilities
If we set $p_i = n^{-1}$ in case (iii), equations (10) become:

$$G = n\sum(x_i^2)/N - N, \qquad E(G) = n - 1, \qquad \operatorname{var} G = 2(n-1)(N-1)/N. \tag{11}$$

One merit of case (iv) is that it allows a correction for continuity, namely the subtraction of
unity from the sample value of $\sum x^2$. It is seen from (11) that differences between adjacent
values of $G$ are proportionate to differences between corresponding values of $\sum x^2$. Thus the
proper correction should be to subtract from the sample value of $\sum x^2$ half the distance to the
preceding value. If one is subtracted from a cell-frequency, say $x_1$, and added to a smaller
cell-frequency, say $x_2$, then $\sum x^2$ is diminished by

$$x_1^2 + x_2^2 - (x_1 - 1)^2 - (x_2 + 1)^2 = 2(x_1 - x_2 - 1).$$

It follows that the distance to the preceding value is at least 2, and that it is exactly 2 if
there is at least one pair of cell-frequencies that differ by two units. If the cell-expectations
are small, it is most unlikely that none of the cells will be empty and that none will contain
just two observations, unless all frequencies are either one or zero, in which case the value
of $\sum x^2$ is the smallest possible. In fact, it appears from the exact distributions of $G$, shown
in § 4, with expectations 2, 1, 1, 0.5, and 0.2 respectively, that the possible values of $\sum x^2$
proceed by steps of two units, up to a cumulative probability of at least 0.999. When
expectations are not small, greater steps might occasionally be located in the critical part
of the range, but then the number of possible values of $G$ is large and the effect of
discontinuity is small. It seems, therefore, that the subtraction of one unit from the sample
value of $\sum x^2$ is an effective correction where a correction is most needed.

It follows from equations (5) and (11) that, with correction for continuity:

$$\chi^2 = \frac{n(\sum x^2 - 1) - N^2}{N - 1}, \qquad \nu = \frac{N(n-1)}{N-1}. \tag{12}$$

Thus another merit of the case (iv) test is that the computations involved are very simple.
However, the most important merit is that it may be considered as an adaptation to small
expectations of Fisher's test for Poisson distributions.
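The case (iv) test needs only the cell counts. As a small sketch (the function name is illustrative), the code below applies approximation (5) to the moments in (11), with the continuity correction of one unit subtracted from $\sum x^2$, giving $\chi^2 = \{n(\sum x^2 - 1) - N^2\}/(N-1)$ and $\nu = N(n-1)/(N-1)$.

```python
def equal_probability_test(counts):
    """Continuity-corrected chi-square and its (generally fractional) parameter nu
    for n cells with equal probabilities; assumes at least two observations."""
    n = len(counts)                       # number of cells
    total = sum(counts)                   # N, number of observations
    sum_sq = sum(x * x for x in counts)   # sum of squared cell-frequencies
    chi2 = (n * (sum_sq - 1) - total ** 2) / (total - 1)
    nu = total * (n - 1) / (total - 1)
    return chi2, nu
```

For the counts `[3, 0, 1, 0, 2, 0]` (six observations in six cells) this gives $\chi^2 = 8.4$ with $\nu = 6$, which would then be referred to a chi-squared table.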

If we wish to test whether the observations $x_1, x_2, \ldots, x_n$ are independent Poisson variates
having a common expectation, the standard test, as first given by R. A. Fisher (1925), is
to calculate

$$\chi^2 = \sum (x_i - \bar{x})^2/\bar{x}.$$

This is the $G$ of (11), the distribution of this criterion under the condition $\sum x_i = N$ being the
same as the distribution of $G$ considered above.

It is well known that Fisher’s criterion discriminates powerfully between a Poisson dis- 
tribution and a distribution with a variance noticeably greater than the mean, but poorly 
if the variance and the mean of the alternative distribution are also equal. Therefore, if the 
data have successfully passed the test of case (iv), it may not be superfluous to try the test 
of case (iii), fitting a Poisson distribution to the data and comparing the observed and 
expected cell-frequencies. 


(v) Case of n elements distributed over n cells with equal probabilities

If we set $N = n$ in case (iv), equations (12) become

$$\chi^2 = \frac{n(\sum x^2 - n - 1)}{n - 1}, \qquad \nu = n. \tag{13}$$

This gives a test particularly suited for testing an assumed even distribution of points on
an interval, if we divide the interval into as many parts of equal length as there are points.
It sometimes happens that the end-points of the interval of observation of a sequence of,
say, accidents are not recorded. Then take the times of the first and the last accident as the
end-points and for the rest, disregard these two accidents. In fact, if $n + 2$ points are
randomly and evenly distributed on an interval $(a, b)$, then under the condition that the
positions of the extreme points are $a'$ and $b'$, the $n$ other points are randomly and evenly
distributed on the interval $(a', b')$.


3. METHODS OF COMPUTATION 


The computations involved in case (i) of the preceding section are so extensive that it is a
practical necessity to arrange them in some economical scheme. In the applications which
we have in mind the number of rows ($n$) will usually be much greater than the number of
columns ($m$) and most of the row totals ($z_i$) will have small integral values. Thus the
quantity $\sum y_j^{-1}$ may be computed directly from the $y_j$, but for the computation of $\sum z_i^{-1}$ the $z_i$
should then first be arranged in a frequency table. If $n_z$ denotes the number of rows (workers)
with $z_i = z$, then

$$\sum z_i^{-1} = \sum n_z z^{-1}.$$

It is economical to combine these computations with those for $G$. We shall give two schemes,
one for $m > 2$ and one for $m = 2$.


Scheme for m > 2

(1) Write the rows of design (1) on separate strips or on the upper margin of cards.

(2) For each $j$, sort the strips according to $x_{ij}$. Note the frequency distribution of $x_{ij}$,
its number, $n$, and its sum, $y_j$. Pool the frequency distributions for all $j$ and note the sum, $N$,
and the sum of squares, $Q$.

(3) Sort the strips according to the row total, $z$. Prepare a table with as many rows as
there are different values of $z$ and with $2m + 5$ columns, under the headings,

$$(z)\quad (z^{-1})\quad (n_z)\quad \ldots\ (s_j)\ \ldots\quad (s)\quad \ldots\ (q_j)\ \ldots\quad (q),$$

for each $z$ writing under $(n_z)$ the number of strips, under $(s_j)$ the partial sum of the $x_{ij}$,
under $(s)$ the sum of the $s_j$, under $(q_j)$ the partial sum of squares of the $x_{ij}$, under $(q)$ the sum
of the $q_j$.

(4) Add the columns $(n_z), \ldots, (s_j), \ldots, (s)$ and $(q)$ and check whether the sums are equal to
$n, \ldots, y_j, \ldots, N$ and $Q$, already computed under step (2). Write under the $(q_j)$ the product-sums

$$Q_j = (z^{-1}) \times (q_j).$$

Check whether $\sum Q_j = (z^{-1}) \times (q)$ (written under $(q)$).
Write under $(z^{-1})$ the product-sum

$$\sum z_i^{-1} = (z^{-1}) \times (n_z).$$

(5) Compute the sum $\sum y_j^{-1}$ directly from the $y_j$ and the product-sum

$$\sum_{i,j} x_{ij}^2\, y_j^{-1} z_i^{-1} = (y_j^{-1}) \times (Q_j).$$

(6) Compute $G$, $E(G)$, $\operatorname{var} G$, $c$, $\nu$ and $\chi^2$, using the equations (2), (7), (8) and (5).

It will be observed that the greater part of these computations can be done by
mechanical operation, the squares of the $x_{ij}$ being reproduced automatically, using a
restricted number of master cards. The procedure is illustrated on a numerical example in § 5.

Scheme for m = 2

If there are only two columns, no separate strips are needed and the following simpler
procedure may be followed:

(1) Prepare, by simple scoring, a contingency table of the rows of design (1), with the
two cell-frequencies as entries. Denote the different values of the $x_{i1}$ by $x$ and those of the
$x_{i2}$ by $z - x$. The frequency distributions of $x$ and $z - x$ are found by addition of rows and
columns. Note the total number $n$ and the sums $y_1$ and $y_2$ of $x$ and $z - x$.

(2) Prepare a table with as many rows as there are different values of $z$ and with five
columns, headed

$$(z)\quad (z^{-1})\quad (n_z)\quad (x)\quad (x^2).$$

For every $z$, write under $(n_z)$, $(x)$ and $(x^2)$ the total, the sum of $x$ and the sum of $x^2$ over the
diagonal of the contingency table with the corresponding value of $z$.

(3) Add the columns $(n_z)$ and $(x)$ and check whether the sums are equal to $n$ and $y_1$
respectively. Check whether

$$N = (z) \times (n_z) = y_1 + y_2 \quad \text{(written under } (z)\text{)}.$$

Write under $(z^{-1})$ and $(x^2)$ the product-sums

$$\sum z_i^{-1} = (z^{-1}) \times (n_z) \quad\text{and}\quad Q = (z^{-1}) \times (x^2).$$

(4) For the computation of $G$ use the equation

$$G = N\left(Q(y_1^{-1} + y_2^{-1}) - y_1 y_2^{-1}\right),$$

which is more convenient here than equation (2). Compute $E(G)$, $\operatorname{var} G$, $c$, $\nu$ and $\chi^2$ with
equations (7), (8) and (5).

This simpler procedure is also illustrated in § 5.
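The shortcut for two columns can be checked against the general definition $G = N\sum x_{ij}^2/(y_j z_i) - N$. The following sketch is only an illustration (function names and the small table are invented for the purpose, and every row total is assumed positive); here $Q = \sum_i x_{i1}^2/z_i$, which is what the product-sum $(z^{-1}) \times (x^2)$ accumulates.

```python
import math

def g_general(table):
    """G = N * sum(x_ij^2 / (y_j * z_i)) - N for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    s = sum(x * x / (col_totals[j] * row_totals[i])
            for i, row in enumerate(table)
            for j, x in enumerate(row))
    return total * s - total

def g_two_columns(table):
    """Shortcut for m = 2: G = N * (Q * (1/y1 + 1/y2) - y1/y2)."""
    y1 = sum(row[0] for row in table)
    y2 = sum(row[1] for row in table)
    total = y1 + y2
    q = sum(row[0] ** 2 / (row[0] + row[1]) for row in table)
    return total * (q * (1.0 / y1 + 1.0 / y2) - y1 / y2)
```

For the six-row table `[(1, 0), (0, 1), (1, 1), (2, 0), (0, 2), (1, 0)]` both functions give $G = 6.975$.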


4. SOME TEST CASES FOR THE ACCURACY OF THE APPROXIMATION 


For designs restricted to two columns (case (i) of § 2 with $m = 2$), and row totals, $z_i$, not
greater than two, the computation of exact distributions for $G$ is not unduly heavy. Since
the approximation presumably improves for $m > 2$ and larger row totals, such designs are
particularly suited for test cases. Let $n$ be the number of rows, $n_1$ of which have row total 1,
the remaining $n_2$ having row total 2, then

$$n_1 + 2n_2 = y_1 + y_2 = N.$$

Of the cell-frequencies in column 1 with row total 2, let $a$ be the number of those equal to 1
and $b$ the number of those equal to 2. Then the numbers of the five possible types of row are:

    Cell-frequencies    Number of rows
    (1, 0)              $y_1 - a - 2b$
    (0, 1)              $n_1 - y_1 + a + 2b$
    (1, 1)              $a$
    (2, 0)              $b$
    (0, 2)              $n_2 - a - b$

The probability of a particular random sample is

$$y_1!\, y_2!\, 2^a / N!$$

and the number of possible samples with given $a$ and $b$ is

$$\frac{n_1!\, n_2!}{(y_1 - a - 2b)!\,(n_1 - y_1 + a + 2b)!\, a!\, b!\,(n_2 - a - b)!}.$$

Thus the probability of $a$ is

$$P(a) = \frac{y_1!\, y_2!\, n_1!\, n_2!\, 2^a}{N!\, a!} \sum_b \frac{1}{b!\,(n_2 - a - b)!\,(y_1 - a - 2b)!\,(n_1 - y_1 + a + 2b)!}. \tag{14}$$
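Equation (14) lends itself to exact evaluation in rational arithmetic. The sketch below is an illustration, not the author's computing scheme; the function name is invented, and it assumes $n_1 + 2n_2 = y_1 + y_2$. It sums over all $b$ for which the five row counts are non-negative.

```python
from fractions import Fraction
from math import factorial

def prob_a(a, y1, y2, n1, n2):
    """Exact P(a) from equation (14), as a Fraction."""
    total = y1 + y2                       # N
    acc = Fraction(0)
    for b in range(0, n2 - a + 1):        # keeps n2 - a - b >= 0
        rows_10 = y1 - a - 2 * b          # rows with cell-frequencies (1, 0)
        rows_01 = n1 - y1 + a + 2 * b     # rows with cell-frequencies (0, 1)
        if rows_10 < 0 or rows_01 < 0:
            continue
        acc += Fraction(1, factorial(b) * factorial(n2 - a - b)
                        * factorial(rows_10) * factorial(rows_01))
    lead = Fraction(factorial(y1) * factorial(y2) * factorial(n1) * factorial(n2) * 2 ** a,
                    factorial(total) * factorial(a))
    return lead * acc

# First design of Table 1: y1 = y2 = 20, n1 = 10, n2 = 15.
design_1 = [prob_a(a, 20, 20, 10, 15) for a in range(16)]
```

The probabilities sum to one exactly, and their mean equals $n_2 \cdot 2y_1y_2/\{N(N-1)\}$, as expected for sampling without replacement; individual values can be converted with `float(...)` for comparison with Table 1.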


The criterion $G$ is found to depend on $a$ only, since

$$\sum \frac{x_{ij}^2}{y_j z_i} = \frac{a}{2}\left(\frac{1}{y_1} + \frac{1}{y_2}\right) + \frac{4b}{2y_1} + \frac{4(n_2 - a - b)}{2y_2}
 + \frac{y_1 - a - 2b}{y_1} + \frac{n_1 - y_1 + a + 2b}{y_2} = 2 - \frac{aN}{2y_1 y_2}.$$

It follows by equation (2) that

$$G = N\left(1 - \frac{aN}{2y_1 y_2}\right).$$

Since $a$ proceeds by steps of one, the value corrected for continuity is

$$G' = N\left(1 - \frac{(2a+1)N}{4y_1 y_2}\right). \tag{15}$$

Using equation (14), the exact distribution of $a$ was computed for two designs of the type
shown in Table 1.


Table 1

Design: $y_1 = y_2 = 20$, $n_1 = 10$, $n_2 = 15$, $N = 40$, $n = 25$

     a    P(a)        a    P(a)        a    P(a)
     0    0.00002     6    0.14071    12    0.01825
     1    0.00035     7    0.18906    13    0.00464
     2    0.00255     8    0.19840    14    0.00075
     3    0.01142     9    0.16285    15    0.00006
     4    0.03550    10    0.10378
     5    0.08108    11    0.05054
    Total 0.99996

Design: $y_1 = 280$, $y_2 = 20$, $n_1 = n_2 = 100$, $N = 300$, $n = 200$

     a    P(a)        a    P(a)        a    P(a)
     3    0.00002     9    0.05062    15    0.09893
     4    0.00011    10    0.09289    16    0.05102
     5    0.00060    11    0.14070    17    0.01975
     6    0.00252    12    0.17585    18    0.00541
     7    0.00839    13    0.17997    19    0.00093
     8    0.02276    14    0.14939    20    0.00007
    Total 0.99993

Application of equations (7), (8), (5), (15) and (6) gives the results shown in Table 2. 
The comparison of exact and approximate cumulative distributions of a, which by (15) 
are equivalent to those of G, is shown in Table 3. 

Table 2 

                              Design: $y_1 = y_2 = 20$,          Design: $y_1 = 280$, $y_2 = 20$, 
                              $n_1 = 10$, $n_2 = 15$,            $n_1 = n_2 = 100$, 
                              N = 40, n = 25                     N = 300, n = 200 

E(G) from (7)                 24.615                             199.67 
var G from (8)                15.289                             300.73 
c from (5)                    3.220                              1.328 
v from (5)                    79.26                              265.2 
u from (15), (5) and (6)      2.538√(39 − 2a) − 12.591           1.886√(221 − 6a) − 23.01 














Table 3 

Design: $y_1 = y_2 = 20$, $n_1 = 10$, $n_2 = 15$ 

  a   Exact   Approx.      a   Exact   Approx. 
  0   0.000   0.001        8   0.659   0.624 
  1   0.000   0.002        9   0.822   0.834 
  2   0.003   0.008       10   0.926   0.937 
  3   0.014   0.024       11   0.976   0.983 
  4   0.050   0.062       12   0.995   0.997 
  5   0.131   0.141       13   0.999   1.000 
  6   0.272   0.275       14   1.000   1.000 
  7   0.461   0.460 

Design: $y_1 = 280$, $y_2 = 20$, $n_1 = n_2 = 100$ 

  a   Exact   Approx.      a   Exact   Approx. 
  4   0.000   0.000       12   0.494   0.495 
  5   0.001   0.001       13   0.674   0.676 
  6   0.003   0.004       14   0.824   0.825 
  7   0.012   0.013       15   0.923   0.923 
  8   0.034   0.036       16   0.974   0.973 
  9   0.085   0.086       17   0.994   0.993 
 10   0.178   0.181       18   0.999   0.998 
 11   0.319   0.319       19   1.000   1.000 











In the first design the approximate 1 % level of significance is correct, namely at a = 2. 
The exact 5 % level is at a = 4, the exact and approximate probabilities of a being smaller 
than 5 being equal to 0.050 and 0.062 respectively. This design contains 20 cells with 
expectation 0.5. The second design contains 100 cells with an expectation of only 0.067. 
Yet the approximation is excellent for this design, the 1 and 5 % levels of significance being 
correct. Our conclusion from these comparisons is that in the field of absenteeism there 
will be little occasion for serious errors, due to poor approximation, if case (i) of § 2 is applied. 

Vessereau (1958) has given exact distributions of G in three designs with N elements 
distributed over n equi-probable cells, namely for 

N = 15, n = 5;   N = 16, n = 8;   N = 10, n = 10. 

Such designs, apart from being the only cases of one-way classifications for which the 
computation of the exact distribution is practically possible, are of special importance, since the 
distribution is the same as that of Fisher’s criterion, assuming that $\sum x_i = N$ is fixed (case 
(iv) of § 2). For our purpose, however, Vessereau’s designs are not very critical, since their 
values of N and n, if representing numbers of absences and of workers, seem far below 
anything that will be asked for in practice, whereas expectations smaller than N/n = 1 will 
frequently occur. One also wants to get some idea of the speed with which the approximation 
improves with increasing n in the important case where n = N, about its behaviour if n 
varies with constant N and so forth. Unfortunately one has to take account of the amount of 
computation required, in particular in finding the exact distributions. For designs with 
n = N the work increases rapidly with N; once it is done however for such a design, the 
additional work for designs with the same N and greater values of n is comparatively small. 
On such considerations we decided to take Vessereau’s second and third designs and new 
designs with N = 16 and n = 16, 32 and 80 respectively. For these, we compare the exact 
distribution of G with its approximation according to (12). 

Vessereau has described the following method of finding the exact distribution of G in 
a design of N elements distributed with equal probabilities over n cells: Let a partition of N 
over n be defined as an ordered set of n non-negative integers $x_i$, such that $\sum x_i = N$. Let 
$p_0, p_1, \ldots, p_N$ be a sequence of non-negative integers. The union of all partitions with $p_0$ 
numbers equal to 0, ..., $p_N$ numbers equal to N will be referred to as a class. All partitions 
of a class are equi-probable, since their probability is 

$$\frac{n^{-N}\,N!}{(2!)^{p_2}\cdots(N!)^{p_N}}.$$

The number of partitions of a class is $n!/(p_0!\cdots p_N!)$ so that the probability of a class is 

$$P(p_0,\ldots,p_N) = \frac{n^{-N}\,n!\,N!}{(2!)^{p_2}\cdots(N!)^{p_N}\;p_0!\cdots p_N!}. \tag{16}$$

Let Q be defined by 

$$Q = \sum x_i^2 = p_1 + 4p_2 + \ldots + N^2 p_N, \tag{17}$$

so that all partitions of a class have the same value of Q. Consider two designs with the same 
value of N, the first with n = N and the second with a greater value of n. To every class of 
the first design there exists a class of the second design with the same values of $p_1, \ldots, p_N$ 
and with a value of $p_0$ that is n − N greater. Let q denote the value of $p_0$ under the first 
design, then the value of $p_0$ of the corresponding class under the second design is q + n − N 
and corresponding classes have the same values of q and Q. Denote the probability of the 
union of all classes with given values of q and Q under the first design by $P_N(q, Q)$ and that of 
the corresponding union under the second design by $P_n(q, Q)$. It follows from (16) that 

$$P_N(q,Q) = \frac{N^{-N}\,N!\,N!}{q!}\sum\frac{1}{(2!)^{p_2}\cdots(N!)^{p_N}\;p_1!\cdots p_N!}, \qquad P_n(q,Q) = \frac{(N/n)^N\,n!\,q!}{N!\,(q+n-N)!}\,P_N(q,Q). \tag{18}$$

Since the summation over all combinations of $p_1, \ldots, p_N$ that are compatible with the given 
values of q and Q is the most laborious part, this shows that the simultaneous distribution 
of q and Q under the second design is rather easily found if that under the first design is 
known. Since 

$$G = \frac{n}{N}\,Q - N, \tag{19}$$

the distribution of G follows directly from that of Q which in turn is found in the obvious 
way from that of q and Q. The greater part of the work was done with punched cards by the 
Mechanical Operation Section of the Institute. The exact conditional distributions of 
Fisher’s criterion were found to be as shown in Table 4. 
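
On a modern machine the enumeration of classes needs no punched cards. The Python sketch below is an illustration under stated assumptions: it lists the integer partitions of N, applies the class probability (16) and the relation (19), and accumulates the distribution of G. The helper names `partitions` and `exact_g_distribution` are hypothetical, and the shortcut (18) is not used because direct enumeration is cheap for N = 16.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(total, largest=None):
    """Yield the integer partitions of `total` as non-increasing tuples."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, first):
            yield (first,) + rest


def exact_g_distribution(N, n):
    """Exact distribution of G for N elements over n equi-probable cells."""
    dist = {}
    for parts in partitions(N):
        if len(parts) > n:
            continue                      # more occupied cells than cells available
        counts = Counter(parts)           # p_k for k >= 1
        p0 = n - len(parts)               # number of empty cells
        denom = factorial(p0)
        for k, pk in counts.items():
            denom *= factorial(k) ** pk * factorial(pk)
        prob = Fraction(factorial(n) * factorial(N), n ** N * denom)   # equation (16)
        Q = sum(k * k * pk for k, pk in counts.items())               # equation (17)
        G = Fraction(n * Q, N) - N                                    # equation (19)
        dist[G] = dist.get(G, 0) + prob
    return dist


dist_16_16 = exact_g_distribution(16, 16)
```

Calling `exact_g_distribution(16, 32)` or `exact_g_distribution(16, 80)` gives the other two designs; the exact fractions can be rounded with `float()` for comparison with Table 4.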



























C. A. G. Nass 


Table 4 

Design: N = 16, n = 16 

   G     P(G)        G     P(G)        G     P(G)        G     P(G) 
   0      -         16   0.13914      32   0.00332      48   0.00009 
   2   0.00014      18   0.10746      34   0.00276      50   0.00006 
   4   0.00310      20   0.06998      36   0.00157      52   0.00002 
   6   0.02302      22   0.04922      38   0.00094      54   0.00001 
   8   0.07212      24   0.02740      40   0.00044      56   0.00001 
  10   0.12829      26   0.01959      42   0.00020      58   0.00001 
  12   0.17017      28   0.01172      44   0.00023      60   0.00001 
  14   0.16260      30   0.00621      46   0.00016      62+  0.00001 
                                                     Total   1.00000 

Design: N = 16, n = 32 

   G     P(G)        G     P(G)        G     P(G) 
  16   0.01040      44   0.04089      72   0.00024 
  20   0.07343      48   0.02256      76   0.00010 
  24   0.18562      52   0.00936      80   0.00010 
  28   0.23396      56   0.00468      84   0.00006 
  32   0.19905      60   0.00246      88   0.00002 
  36   0.13970      64   0.00126      92   0.00001 
  40   0.07549      68   0.00060      96+  0.00001 
                                   Total   1.00000 

Design: N = 16, n = 80 

   G     P(G)        G     P(G)        G     P(G) 
  64   0.20039     114   0.01344     164   0.00008 
  74   0.36995     124   0.00427     174   0.00004 
  84   0.25504     134   0.00193     184   0.00001 
  94   0.10990     144   0.00051     194+  0.00001 
 104   0.04430     154   0.00013 
                                   Total   1.00000 





The fact that the sum of the probabilities is exactly one in all three designs is a coincidence, 
since no adjustment to that effect was made. For the approximative distributions the normal 
approximation (4) was not used, since four of the values of v were too small. K. Pearson’s 
(1922) Tables of the Incomplete Γ-Function were used instead. These give cumulative 
probabilities I(u, p), starting from the left, of a variate u, associated with a parameter p, 
which are related to the v and the $\chi^2$ of equations (9) by 

$$p = \tfrac12 v - 1, \qquad u = \chi^2/(2v)^{1/2}.$$

For this purpose, using the correction for continuity, equations (11) and (12) were replaced by 

$$p = \frac{N(n-1)}{2(N-1)} - 1, \qquad u = \frac{NG - n}{\sqrt{2N(n-1)(N-1)}}. \tag{20}$$

Application of (20) gave the results shown in Table 5. 


Table 5 

Design    N = 10, n = 10    N = 16, n = 8      N = 16, n = 16    N = 16, n = 32    N = 16, n = 80 

p         4                 2.733              7                 15.533            41.133 
u         0.24846(G − 1)    0.13802(2G − 1)    0.18856(G − 1)    0.13129(G − 2)    0.08216(G − 5) 
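
The entries of Table 5 follow directly from (20). As a minimal sketch (the function name `pearson_parameters` and the slope/offset form are choices made here for illustration), u can be written as slope × (G − offset), the offset n/N being half a step of G:

```python
from math import sqrt


def pearson_parameters(N, n):
    """Return p and the (slope, offset) of u = slope * (G - offset), from equation (20)."""
    p = N * (n - 1) / (2 * (N - 1)) - 1
    slope = N / sqrt(2 * N * (n - 1) * (N - 1))
    offset = n / N   # continuity correction: half a step of G
    return p, slope, offset


p_80, slope_80, offset_80 = pearson_parameters(16, 80)
```

For N = 16, n = 8 the offset is 0.5, which Table 5 expresses by writing u as a multiple of (2G − 1).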

















The comparisons of the cumulative distributions, starting from high values of G, are shown 
in Table 6. 

Comparing the two designs with N = n, it appears that the approximation improves 
moderately, if N increases from 10 to 16. In the latter design, the 1 and 5% levels of 
significance are correct. In the former design the 5% level is correct. The approximate 
1% level is 22, whereas the exact probability of G being equal to 22 or greater is 0.014. 
The error is hardly of practical importance. It may be safely concluded that for designs with 
N = n, the approximation is good enough if N = 16 or greater. Comparing the four designs 
with N = 16, it appears that the approximation deteriorates if n increases from 8 to 80 and 
the cell-expectation decreases from 2 to 0.2. Yet even in the last design the nominal 1 and 
5% levels are still correct, namely at G = 124 and 114 respectively. 

In the last design there are 80 cells, each with expectation 0.2. We have seen before that 
the approximation is excellent in a design with 100 cells with expectation 0.067, 100 cells 
with expectation 0.13, 100 cells with expectation 0.93 and 100 cells with expectation 1.87. 
It seems that the disturbing influence of small expectations is effectively counteracted by 
large numbers of cells, especially if there are also considerable numbers of cells with 
somewhat greater expectations. 


5. APPLICATION OF THEORY TO NUMERICAL EXAMPLES 


Example 1 


The records of Farmer & Chambers (1939) of the accidents of 166 London bus-drivers over 
a period of 5 years constitute a design with 5 columns and 165 rows, since one driver, who 
had not a single accident, had to be omitted. So m = 5, n = 165. The yearly totals and the 
grand total are: 





  $y_1$    $y_2$    $y_3$    $y_4$    $y_5$      N 
  301     256     268     250     255     1330 

Table 6. Cumulative distributions, starting from high values of G. 


























































































































   G   Exact   Approx.      G   Exact   Approx.      G   Exact   Approx.      G   Exact   Approx. 

Design: N = 10, n = 10 (exact distribution taken from Vessereau) 
   2   1.000   1.000       10   0.433   0.402       18   0.039   0.042       26   0.003   0.003 
   4   0.983   0.972       12   0.240   0.270       20   0.024   0.020       28   0.002   0.001 
   6   0.869   0.852       14   0.157   0.154       22   0.014   0.010       30   0.001   0.000 
   8   0.657   0.651       16   0.071   0.082       24   0.006   0.004       32   0.000     - 

Design: N = 16, n = 8 (exact distribution taken from Vessereau) 
   2   1.000   1.000        9   0.384   0.379       16   0.038   0.039       23   0.004   0.003 
   3   0.989   0.986       10   0.274   0.288       17   0.027   0.027       24   0.003   0.002 
   4   0.935   0.935       11   0.209   0.214       18   0.020   0.018       25   0.002   0.001 
   5   0.859   0.846       12   0.151   0.159       19   0.014   0.013       26   0.001   0.001 
   6   0.732   0.731       13   0.112   0.127       20   0.009   0.008       27   0.001   0.000 
   7   0.618   0.607       14   0.073   0.079       21   0.006   0.006       28   0.000     - 
   8   0.479   0.487       15   0.056   0.056       22   0.004   0.004 

Design: N = 16, n = 16 
   0   1.000   1.000       12   0.603   0.609       24   0.047   0.047       36   0.002   0.001 
   2   1.000   0.999       14   0.441   0.453       26   0.028   0.025       38   0.001   0.000 
   4   0.997   0.994       16   0.301   0.316       28   0.016   0.014       40   0.001     - 
   6   0.974   0.963       18   0.194   0.208       30   0.010   0.007       42   0.001     - 
   8   0.902   0.887       20   0.124   0.131       32   0.007   0.004       44   0.000     - 
  10   0.773   0.762       22   0.075   0.079       34   0.004   0.002 

Design: N = 16, n = 32 
  16   1.000   0.997       32   0.497   0.518       48   0.041   0.036       64   0.002   0.001 
  20   0.990   0.973       36   0.298   0.322       52   0.019   0.014       68   0.001   0.000 
  24   0.916   0.890       40   0.158   0.173       56   0.010   0.005       72   0.001     - 
  28   0.731   0.730       44   0.082   0.084       60   0.005   0.002       76   0.000     - 

Design: N = 16, n = 80 
  64   1.000   0.966       94   0.175   0.201      124   0.007   0.002      154   0.000   0.000 
  74   0.800   0.789      104   0.065   0.059      134   0.003   0.000 
  84   0.430   0.480      114   0.020   0.012      144   0.001   0.000 

The numbers $n_z$ of the personal totals z are: 

   z   $n_z$      z   $n_z$      z   $n_z$      z   $n_z$ 
   1    2       6   17      11    9      16    6 
   2    3       7   14      12    6      17    3 
   3   14       8   14      13    2      19    1 
   4   17       9   12      14    6      21    3 
   5   21      10   13      15    1      32    1 
                                     Total  165 











The sums of the inverses of row- and column-totals are: 
$\sum z_i^{-1} = 28.5788$, $\sum y_j^{-1} = 0.018881$. 
The results of equations (7) and (8) are: 
p = 143-762, yw = 3-9880, 
o= 8121, 7 = 0-000084, 
E(G) = 656.5, var G = 1085.2. 

Using (5), the parameters c and v are: c = 1.210, v = 794.4. 

To illustrate the method of computation, recommended in § 3, for designs with more than 
two columns, the personal scores of the $n_z = 6$ drivers with z = 12 accidents in the total 
period were as follows (first scheme, step 3): 








Year j                       1     2     3     4     5    Total 
$s_j = \sum x_{ij}$:          16    14    14    13    15     72 
$q_j = \sum x_{ij}^2$:        52    34    42    31    47    206 
| | | 





Continuing with step 3 of the first scheme, we shall have 20 rows (corresponding to 20 
different z-values) with 2m + 5 = 15 column headings. An extract from the resulting table 
is given in Table 7, showing the column ‘totals’ and the figures in that one of the 20 rows 
which corresponds to z = 12. 

Note that only the ‘totals’ of columns 3, 4, 5, 6, 7, 8 and 9 are real totals, the other ones 
being product-sums, as described under step 4. There is no ‘total’ corresponding to 
column 1. With these results, $\chi^2$ is computed as follows (step 5): 


$$\sum x_{ij}^2\, y_j^{-1} z_i^{-1} = \frac{100.425}{301}+\frac{73.909}{256}+\frac{80.047}{268}+\frac{73.708}{250}+\frac{73.428}{255} = 1.5038,$$

G = 1330 × 0.5038 = 670.1 (see equation (2)), 

$\chi^2$ = 1.210 × 670.1 = 810.8 (see equation (5)), 

u = √1621.6 − √1587.8 = 0.42 (see equation (6)), 


so that the test does not reveal a significant departure from homogeneity. 
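
The arithmetic of step 5 can be retraced in a few lines of Python. This is only a sketch: the ‘totals’ are copied from Table 7, c and v are the values quoted above from equation (5), and the form u = √(2χ²) − √(2v − 1) for equation (6) is read off from the worked figures rather than quoted from the equation itself.

```python
from math import sqrt

# 'Totals' from Table 7: column totals y_j and product-sums of q_j * z^{-1}.
y = [301, 256, 268, 250, 255]
q_over_z = [100.425, 73.909, 80.047, 73.708, 73.428]
c, v = 1.210, 794.4                            # quoted from equation (5)

N = sum(y)                                     # grand total, 1330
s = sum(q / yj for q, yj in zip(q_over_z, y))  # sum of x_ij^2 / (z_i y_j)
G = N * (s - 1)                                # equation (2)
chi2 = c * G                                   # equation (5)
u = sqrt(2 * chi2) - sqrt(2 * v - 1)           # equation (6), form assumed as stated above
```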




































































Table 7 

Column          1      2            3        4        5        6        7        8        9 
Heading         (z)    ($z^{-1}$)   ($n_z$)  ($s_1$)  ($s_2$)  ($s_3$)  ($s_4$)  ($s_5$)  (s) 
Row (z = 12)    12     0.0833       6        16       14       14       13       15       72 
‘Total’         -      28.5788      165      301      256      268      250      255      1330 

Column          10        11        12        13        14        15 
Heading         ($q_1$)   ($q_2$)   ($q_3$)   ($q_4$)   ($q_5$)   (q) 
Row (z = 12)    52        34        42        31        47        206 
‘Total’         100.425   73.909    80.047    73.708    73.428    401.517 

Example 2 


As an illustration of the outline for a design with two columns, consider Adelstein’s data 
of accidents of 122 shunt-workers, specified for two periods of 5 and 6 years, cited from Arbous 
& Kerrich (1951). The complete records are given in the form shown in Table 8 (see 
step 1 of the second scheme of § 3). 

In agreement with the scheme, the number of accidents in the 5-year period is denoted 
by x and that of the 6-year period by z − x. If the difference between the periods makes 
this worth while, it saves some computation to take the shortest one for x. From the 8 
diagonals of this concise table the lines of the 5-column table are easily written down 
(step 2 of the scheme for m = 2) (Table 9). 


Table 8. Adelstein’s accident data. 

                      z − x 
   x       0     1     2     3     4     5     6    Total 
   0      21    18     8     2     1     .     .      50 
   1      13    14    10     1     4     1     .      43 
   2       4     5     4     2     1     .     1      17 
   3       2     1     3     2     .     1     .       9 
   4       .     .     1     1     .     .     .       2 
   5       .     .     .     .     .     .     .       0 
   6       .     .     .     .     .     .     .       0 
   7       .     1     .     .     .     .     .       1 
 Total    40    39    26     8     6     2     1     122 

Table 9 

  (z)      ($z^{-1}$)   ($n_z$)    (x)     ($x^2$) 
   1       1.0000       31       13      13 
   2       0.5000       26       22      30 
   3       0.3333       19       26      48 
   4       0.2500        7       12      26 
   5       0.2000        9       17      39 
   6       0.1667        5       13      39 
   7       0.1429        1        4      16 
   8       0.1250        3       12      62 
 ‘Total’ 
  274      54.984      101      119      74.836 














In agreement with step 3 of the scheme, only the ‘totals’ of the columns headed ($n_z$) and (x) 
are sums, the other ones being product-sums. Hence 

m = 2, n = 101, N = 274, $y_1 = 119$, $y_2 = 155$, $\sum y_j^{-1} = 0.014855$, $\sum z_i^{-1} = 54.984$, 
G = 274 × (74.836 × 0.014855 − 119/155) = 94.26 (step 4), 
p = 63-370, x = 0-99634, 
o = 17-885, 7 = 0-00026, 
E(G) = 100.37, var G = 92.99 (see equations (7) and (8)), 
c = 1.079, v = 108.3, $\chi^2$ = 101.7, u = −0.42 (see equations (5) and (6)). 

Again there is no significant departure from homogeneity. 

Example 3 


We have applied the same test to the absences of the 248 shift-workers recorded by Arbous 
& Sichel (1954) and again there was no significance. The personal totals of these shift-workers 
will be used here to illustrate the test of fit of an assumed negative binomial distribution 
(case (iii) of § 2). We denote the total number of workers by n, the numbers of workers with 
z absences in the total period by $n_z$ and the expectations of these numbers by $\tilde{n}_z = n p_z$. 
The figures given in Table 10 are taken from Arbous & Sichel, table 1: 


Table 10. Data for absences from Arbous & Sichel. 









































   z    $n_z$   $\tilde{n}_z$   $\tilde{n}_z^{-1}$   $n_z^2/\tilde{n}_z$        z    $n_z$   $\tilde{n}_z$   $\tilde{n}_z^{-1}$   $n_z^2/\tilde{n}_z$ 

   0     7    11.898    0.09     4.118       25     2    1.668     0.60     2.398 
   1    16    16.086    0.06    15.914       26     3    1.452     0.69     6.198 
   2    23    17.733    0.06    29.831       27     3    1.265     0.79     7.115 
   3    20    18.072    0.06    22.134       28     1    1.101     0.91     0.908 
   4    23    17.667    0.06    29.943       29     2    0.958     1.04     4.175 

   5    24    16.829    0.06    34.227       30     -    0.833     1.20       - 
   6    12    15.751    0.06     9.142       31     1    0.724     1.38     1.381 
   7    13    14.553    0.07    11.613       32     1    0.629     1.59     1.590 
   8     9    13.320    0.08     6.081       33     1    0.546     1.83     1.832 
   9     9    12.089    0.08     6.695       34     -    0.474     2.11       - 
  10     8    10.920    0.09     5.861       35     1    0.411     2.43     2.433 
  11    10     9.808    0.10    10.195       36     -    0.356     2.81       - 
  12     8     8.772    0.11     7.296       37     -    0.308     3.25       - 
  13     7     7.817    0.13     6.268       38     -    0.267     3.75       - 
  14     2     6.944    0.14     0.576       39     -    0.231     4.33       - 
  15    12     6.153    0.16    23.404       40     -    0.200     5.00       - 
  16     3     5.439    0.19     1.655       41     -    0.173     5.78       - 
  17     5     4.798    0.21     5.211       42     -    0.150     6.69       - 
  18     4     4.224    0.24     3.788       43     -    0.129     7.69       - 
  19     2     3.713    0.27     1.077       44     -    0.112     8.96       - 
  20     2     3.259    0.31     1.227       45     -    0.096    10.37       - 
  21     5     2.857    0.35     8.750       46     -    0.083    12.01       - 
  22     5     2.501    0.40     9.996       47     -    0.072    13.90       - 
  23     2     2.188    0.46     1.828       48     1    0.062    16.12    16.116 
  24     1     1.911    0.52     0.523       49+    -    0.386     2.59       - 

                                           Total  248  248.015   122.18   301.499 








According to equations (10) we have: 
G = 301.499 − 248 = 53.5, E(G) = 49, 
var G = 2 × 49 − (2500 + 100 − 2)/248 + 122.18 = 209.7, 

or, with the correction suggested under case (iii) of § 2, for the two estimated parameters 
of the distribution: 

E(G) = 47 × 49/49 = 47, var G = 47 × 209.7/49 = 201. 

It follows by the equations (5) that 
c = 2 × 47/201 = 0.468, v = 0.468 × 47 = 22.0, $\chi^2$ = 0.468 × 53.5 = 25.0, 

so that P = 0.3. 

There is no significant departure from the negative binomial distribution, although full 
account is taken of the fact that one worker had 48 absences. The computations would be 
less if all expectations less than five or so were grouped, but the power of the test would be 
heavily reduced. Judging from the results of § 4, we suppose that the approximation is still 
usable, although the inevitable grouping is restricted to expectations smaller than 0.06. 
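
The moment calculations of this example can be sketched in Python. Equations (10) and (5) are not reproduced in this section, so the formulas below are inferred from the worked figures (E(G) = k − 1 and var G = 2(k − 1) − (k² + 2k − 2)/n + Σ ñ⁻¹ for k classes, both scaled by (k − 1 − e)/(k − 1) for e estimated parameters, then c = 2E(G)/var G and v = cE(G)). They should be read as an assumption; the function name `case_iii_parameters` is likewise hypothetical.

```python
def case_iii_parameters(G, k, n, sum_inv_expect, estimated=0):
    """Mean, variance, c, v and chi-square for a goodness-of-fit test with k classes.

    The formulas are inferred from the worked figures of Examples 3 and 4.
    """
    mean = k - 1
    var = 2 * (k - 1) - (k * k + 2 * k - 2) / n + sum_inv_expect
    if estimated:
        scale = (k - 1 - estimated) / (k - 1)   # correction for estimated parameters
        mean, var = mean * scale, var * scale
    c = 2 * mean / var
    return mean, var, c, c * mean, c * G


# Example 3: 50 classes (z = 0, ..., 48 and 49+), 248 workers, two estimated parameters.
mean3, var3, c3, v3, chi2_3 = case_iii_parameters(53.5, 50, 248, 122.18, estimated=2)
```

The same function applied to a Poisson fit with seven classes, 122 workers, Σ ñ⁻¹ = 19.5, G = 16.1 and one estimated parameter gives a variance of about 25.8 and a χ² of about 6.2.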


Example 4 


As an illustration of the testing of an assumed Poisson distribution, according to the cases 
(iii) and (iv) of §2, we have selected the accidents of the shunt-workers in the 5-year 
subperiod (see Example 2 above), because Arbous & Kerrich (1951, p. 356) asserted that 
for each subperiod ‘the distribution of accidents was found to agree satisfactorily with the 
Poisson distribution’: 



































      Data            Case (iv)                    Case (iii) 

   x    $n_x$      $x n_x$    $x^2 n_x$      $\tilde{n}_x$              $\tilde{n}_x^{-1}$    $n_x^2/\tilde{n}_x$ 
   0    50        0        0       46.00              0.0      54.3 
   1    43       43       43       44.87              0.0      41.2 
   2    17       34       68       21.88              0.0      13.2 
   3     9       27       81        7.11              0.1      11.4 
   4     2        8       32        1.73              0.6       2.3 
   5     0        0        0        0.338             3.0       0.0 
   6     0        0        0        0.055 } 
   7     1        7       49        0.009 }  0.064   15.7      15.7 

       122      119      273      121.99             19.5     138.1 








Beginning with the test of case (iv), we find according to equations (12): 
$\chi^2$ = (122 × 272 − 119²)/118 = 161.2, v = 119 × 121/118 = 122.0. 
It follows from (6) that 
u = √322.4 − √243 = 2.37, giving P = 0.009. 

Comparing the columns headed $n_x$ and $\tilde{n}_x$, it will be seen that the significance is almost 
wholly due to the worker who had seven accidents. The Poisson probability of one or more 
workers having seven or more accidents is roughly equal to the P found with case (iv). 
Continuing with case (iii), it is necessary to curtail the distribution at a final class of x = 6 
and more, with an expectation of 0.064. Using equations (10) it is found that 

G = 138.1 − 122 = 16.1, E(G) = 6, var G = 12 − (49 + 14 − 2)/122 + 19.5 = 31.0, 

or with the correction suggested under case (iii) of § 2, 

E(G) = 5, var G = 25.8, 
and by equations (5), 

c = 10/25.8 = 0.387, v = 0.387 × 5 = 1.94, $\chi^2$ = 0.387 × 16.1 = 6.23, 

which is just significant at the 5% level. 

Application of the classical test requires a further curtailment at a final class of x = 4 and 
more with expectation 2.1. The result is $\chi^2$ = 2.43, v = 3, which is not even significant at the 
30% level. The strong evidence of the worker with seven accidents is completely destroyed 
if only the information that he had four or more accidents can be used. 


Example 5 


As an illustration of the application of the test of case (v) of § 2 to an assumed even 
distribution of points on an interval, we take the 108 explosions in British mines, in a period 
of 26,263 days, recorded by Maguire, Pearson & Wynn (1952). The period is divided into 
108 subperiods of 243.176 days each: 














   x    $n_x$     $x n_x$    $x^2 n_x$     $\tilde{n}_x$ 
   0    44       0        0      39.7 
   1    37      37       37      39.7 
   2    17      34       68      19.9 
   3     5      15       45       6.6 
   4     3      12       48  } 
   5     2      10       50  }    2.1 
       108     108      248     108.0 























According to equations (13) and (6) we have: 
$\chi^2$ = 108 × (248 − 109)/107 = 140.3, v = 108, u = √280.6 − √215 = 2.09, 

giving P = 0.02. The approximation is certainly good for N = n = 108, since according to 
§ 4 it is already good for N = n = 16. 

Again, the last column gives the Poisson expected group frequencies $\tilde{n}_x$ for a mean of 
1.00. The standard $\chi^2$ test gives $\chi^2$ = 5.46, v = 3, which is not significant at the 10 % level. 
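
The figures of cases (iv) and (v) in Examples 4 and 5 follow one pattern, which the Python sketch below reproduces. The formulas are inferred from the worked numbers and not quoted from equations (12) and (13): with n cells, N events and Q = Σ x², the examples use χ² = (n(Q − 1) − N²)/(N − 1), where Q − 1 appears to be the continuity correction, together with v = N(n − 1)/(N − 1) and u = √(2χ²) − √(2v − 1). The function name `dispersion_test` is hypothetical.

```python
from math import sqrt


def dispersion_test(freq):
    """freq[x] = number of cells (workers or subperiods) containing x events."""
    n = sum(freq)                                       # number of cells
    N = sum(x * f for x, f in enumerate(freq))          # number of events
    Q = sum(x * x * f for x, f in enumerate(freq))      # sum of squared cell counts
    chi2 = (n * (Q - 1) - N * N) / (N - 1)              # Q - 1: continuity correction
    v = N * (n - 1) / (N - 1)
    u = sqrt(2 * chi2) - sqrt(2 * v - 1)
    return chi2, v, u


# Example 4 (shunt-workers, 5-year period) and Example 5 (mine explosions).
chi2_4, v_4, u_4 = dispersion_test([50, 43, 17, 9, 2, 0, 0, 1])
chi2_5, v_5, u_5 = dispersion_test([44, 37, 17, 5, 3, 2])
```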


6. IMPORTANCE OF TESTS OF HOMOGENEITY IN STATISTICS OF ABSENTEEISM 


With the rising standards of life and social security the importance of psychological and 
sociological grounds of human absence behaviour seems to increase in relation to purely 
medical reasons. The urgent need of a rational control of absenteeism calls for more know- 
ledge of this complex phenomenon. If statistics are to be of any service for this difficult 
problem, some statistical model must be accepted first. The following is an extension of the 
model invented by Greenwood & Woods (1919) when dealing with industrial accidents. For 
further reference see the extensive review of Arbous & Kerrich (1951). 

Imagine that every worker has a private urn, with many white and few black balls, that 
he is frequently drawing a ball with replacement and that he has an accident, or starts an 
absence, if he draws a black ball. The proportions of black balls in the urns may be different. 
They are associated with the personal liability for accidents or absences of the workers 
concerned and with the risk and stress of their work. The same could be said of the speed of 
drawing balls. However, it would make no difference if the number of black balls of a certain 
worker were halved and his speed of drawing were simultaneously doubled. So we are free 
to associate the proportions of black balls and the speeds of drawing to different groups of 
causes. Let a worker’s proportion of black balls be associated with his personal character- 
istics and his place and function in the concern, things that are relatively constant in time. 
Let his speed of drawing be associated with factors that are relatively variable in time and 
may roughly be divided into the following groups: 

A. Homogeneous factors, that are the same for all workers of the observed population, 
such as weather conditions, very contagious diseases, economic activity, etc. 

B. Inhomogeneous factors, that may be different for different groups of workers, such as 
moderately contagious diseases, changes in production scheme, in personnel policy, change 
of officers who are sociologically important, development of attitudes towards the concern, 
measures to improve working conditions and so on. 

Since Greenwood first discussed the matter, attention has been mainly focused on the 
personal liability of workers. It was in order to isolate this factor, that most of the studies, 
reviewed by Arbous & Kerrich, were on carefully selected, homogeneous populations, place 
and function being the same for all workers and relations being such that the inhomo- 
geneous factors (B) could hardly be suspected of being different for different groups of 
workers. As a result of those studies, it seems to have been established that the liability is 
different for different workers and that we may regard the proportion of black balls, multi- 
plied by a suitable constant, as having approximately a gamma-distribution in such homo- 
geneous populations. For more cautious assertions, see Arbous & Kerrich (1951). The 
practical consequences of these results were somewhat disappointing. 

In this context it seems only natural to pay more attention to the inhomogeneous factors 
(B), since these seem most liable to practical control. If one or more factors, which influence 
absenteeism, are varying in time, the speed of drawing will be multiplied by a number that 
is also varying in time. If those factors are changing similarly for all workers, the multi- 
plier will also change similarly for all workers, that is, it will be equal for all workers at 
any fixed moment. Without loss of generality we may consider that the drawings occur 
simultaneously in this case. So, in any subperiod, every worker would have the same 
number of drawings and the distribution of black balls in the m x n-design with (m — 1) (n—1) 
degrees of freedom would be homogeneous in the statistical sense. Conversely, if the test 
gives an insignificant result, it may be concluded that there is no reason to investigate the 
factors (B). In the opposite case one will conclude that a partial epidemic has occurred 
during the period of observation, that is, some of the workers have changed their absence 
behaviour in relation to the others. Several statistical methods might help to find out 
which of the factors (B) is responsible for that change. For example, if there are only two 
subperiods, one might single out a group of workers who had most of their absences in the 
first subperiod and another group of those who had them mainly in the second. Then one 
has to consider in which other respects these two groups are different. At the present stage 
of our knowledge it will often be difficult to find a reasonable explanation for an apparent 
statistical inhomogeneity. A reliable diagnostic system can be built up only if a large 
number of tentative diagnoses is already available. 


JUMP ANALYSIS 


By G. H. JOWETT AND WENDY M. WRIGHT 
Department of Statistics, University of Sheffield 


1. INTRODUCTION 


When a length of time series is such as to fall naturally into k successive sections, each con- 
sisting of n evenly spaced successive terms, it is often desired to carry out statistical tests 
to determine whether the probability structure of the series is unaffected by the transition 
from one section to the next or suffers a change in mean. If variation within sections is 
random, the conventional analysis relevant to this issue would be an analysis of variance 
between and within sections, the observed variance of section means being compared, using 
the F-test, with their hypothetical variance, estimated from the pooled variance within 
sections, under the null hypothesis of a common underlying population mean. In many 
applications, however, the structural assumption of random variation about a mean varying 
from section to section is replaced by that of sections of an autocorrelated series subject to 
different displacements of mean, potentially due to assignable causes, from section to sec- 
tion; when this is so, the analysis described above is no longer valid, and a modified or 
substitute form of analysis must be found. Unfortunately the analysis of variance pattern 
of test leads to two difficulties. 

The first of these arises because the observed means of different sections do not vary 
independently under the null hypothesis, with the result that the variance of section means 
cannot be taken as a chi-squared variate. This is not an insuperable difficulty, because at 
worst the expected value and standard error of this mean square can be worked out from the 
correlational properties of the series if these can be sufficiently well determined, thus 
providing a test of significance which is asymptotically valid. 

The second and more fundamental difficulty arises from the fact that the between sections 
mean square is essentially the mean semi-squared difference between all possible pairs of 
section means, and its expectation can only be determined if the mean semi-squared differ- 
ence between any such pair expected under the null hypothesis can be predicted. Un- 
fortunately this prediction always involves correlational properties of the series with lags 
greater than the length of a section, even if the means of the pair are those of adjacent sections, 
since terms separated by more than a section length are involved; these properties can only 
be inferred from the within section comparison provided by the data when it can safely be 
assumed that the autocorrelation dies away within the length of a section, or in the unusual 
event of extensive extrapolation being possible. To overcome this difficulty it is necessary 
to abandon the analysis of variance approach altogether, and find an alternative. 

The test proposed in this paper uses what will be called the between-section jump statistic 
$B_\alpha$, defined as the mean semi-squared difference (or jump) between the means of the last 
α (< ½n) terms of each section and the first α terms of the following section. 

In § 2 there will be developed an asymptotic test using $B_\alpha$ as a criterion which is 
appropriate under certain generally applicable working assumptions of smoothness; in § 3 this 
test will be given more detailed examination, particularly as regards the choice of an α for 
maximal sensitivity, under the more restrictive but still widely applicable assumption of 
locally Markoff variation (defined and discussed in Davies & Jowett (1958)). Under 
favourable conditions, and for suitable choice of α, it will be shown that the statistic 

$$J_\alpha = \ln(B_\alpha/W_\alpha), \tag{1}$$

(where $W_\alpha$ is the mean semi-squared difference of all comparable within-section jumps) has 
expectation zero and approximate asymptotic standard error given by 


oy, = /(E) (2) 


under the null hypothesis, and provides the test of significance required. 


2. SAMPLING VARIANCE OF $B_\alpha$ IN TERMS OF SERIAL VARIATION PARAMETERS 


Denote the series by x(t), the kn observations being given by t = 1, 2, ..., kn and having 
structure 
$$x[(s-1)n+u] = \mu_s + \xi[(s-1)n+u], \tag{3}$$
where ξ(t) will be taken as a stationary series for purposes of exposition, though this 
assumption may be relaxed and the techniques developed in this paper applied to the more general 
class of smoothly heteromorphic series (following the principle established in Jowett (1957)), 
where only the local variational properties of ξ(t) are effectively stationary. 

The second-order variational properties of ξ(t) are defined by the serial variation function 
δ(τ), given by the formula 
$$\delta(\tau) = E\{\tfrac12[\xi(t)-\xi(t+\tau)]^2\}, \tag{4}$$
a symmetric function of τ related to the more commonly used variance σ² and 
autocorrelation function ρ(τ) by the equation 

$$\delta(\tau) = \sigma^2(1-\rho(\tau)). \tag{5}$$

Denote the average of a number α of successive values of ξ(t) centred at t, called a 
centralized average, by $\xi_\alpha(t)$; thus, for example, $\xi_2(1\tfrac12)$ is the mean of ξ(1) and ξ(2). For successive 
values of t, $\xi_\alpha(t)$ is itself a stationary series, with second-order variational properties defined 
by its serial variation function 

$$\delta_{\xi_\alpha}(\tau) = E\{\tfrac12[\xi_\alpha(t)-\xi_\alpha(t+\tau)]^2\}, \tag{6}$$

which for convenience in printing will be denoted by $\delta_\alpha(\tau)$. It is related to δ(τ) as follows: 
d(0) = 55 SS BIE 4—Ja+-m)—Et—4—Jat7+m)] 

x [E(¢-4-—}a+m’)—S(t—}-—a+7+m’)] (7) 


= 5D, EB W{-HElt-4-Ja-+m)-Et-}- Jot my} 
—HE(—4—Ja+7+-m)—E¢-}—Jotr+m'yp 
+ HEC —4—ta+7+m)—Et—4—Ja+m’yp 
+HE(t¢—}— }a+m)—E(t-}—-fat+7+m')P} 
l a 





= 5 ¥ [8en—m’—1)—28(m—m’)+3(m—m’ +7), (8) 


9,2 — 
2a m=1 m’=1 


Be. ie (1) asm 3 “Ss” (1-4) 8(r). 


20 p= —(a—1) r=—(a—1) 
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We may interpret 6,(7) as a repeated centralized average of 4(r) adjusted by a constant 
to make 4,(0) identically zero. In most practical applications 6(7) has a cusp at 7 = 0 and 
rises monotonically, for some distance at least; 6,(7) behaves somewhat similarly, except 
that the cusp and curvature near the cusp are somewhat less pronounced. In other words, 
6,(7) is a smoothed and translated version of 6(7). To evaluate (9) exactly it is clearly neces- 
sary to know 46(7) up to the lag7 +a —1, the values round 7 being particularly critical. 
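Formula (9) is easily evaluated numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the Markoff form $\theta(1-\rho^{|\tau|})$ used as the example input is the one introduced later in the paper. With $\theta = 1$ the values of $\delta_\alpha(\alpha)$ should agree with the $E(W_\alpha)$ block of Table 1.

```python
def markoff_delta(tau, theta=1.0, rho=0.5):
    """Two-parameter Markoff serial variation function theta * (1 - rho**|tau|)."""
    return theta * (1.0 - rho ** abs(tau))


def delta_alpha(delta, alpha, tau):
    """Serial variation function of centralized averages of alpha terms, eq. (9).

    `delta` is any symmetric serial variation function with delta(0) == 0.
    """
    total = 0.0
    for r in range(-(alpha - 1), alpha):
        weight = (1.0 - abs(r) / alpha) / alpha  # triangular weights summing to 1
        total += weight * (delta(tau + r) - delta(r))
    return total


if __name__ == "__main__":
    for rho in (0.0, 0.5):
        d = lambda tau, rho=rho: markoff_delta(tau, rho=rho)
        # delta_alpha(alpha) for alpha = 1..6, to compare with E(W_alpha) in Table 1
        print(rho, [round(delta_alpha(d, a, a), 3) for a in range(1, 7)])
```

The subtraction of `delta(r)` inside the loop is the constant adjustment mentioned in the text, which forces $\delta_\alpha(0) = 0$.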

The expectation, under the null hypothesis, of the semi-squared difference between the means of the $s$th and $s'$th sections is

$$\delta_n[(s-s')n], \tag{10}$$

but the serial variation statistics

$$d(\tau) = \operatorname{Av}\,\tfrac12[x(t) - x(t+\tau)]^2 \tag{11}$$

required to estimate $\delta(\tau)$ may be computed from within-section differences only up to $\tau = n-1$. It is thus evident why such expectations cannot in general be estimated and why the analysis of variance test based on them cannot in general be applied.
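The restriction to within-section differences can be made concrete by a short sketch of the statistics (11). This is an illustration under stated assumptions rather than the authors' procedure: the function name is hypothetical, and "Av" is taken to mean the plain average over all pairs of terms at lag $\tau$ lying in the same section.

```python
def serial_variation_statistics(x, n):
    """Return {tau: d(tau)} for tau = 1..n-1, using within-section pairs only.

    x is the full series of k*n terms; n is the section length.
    """
    if len(x) % n != 0:
        raise ValueError("series length must be a multiple of the section length n")
    k = len(x) // n
    d = {}
    for tau in range(1, n):
        total, count = 0.0, 0
        for s in range(k):
            section = x[s * n:(s + 1) * n]
            # pairs (u, u + tau) that both fall inside section s
            for u in range(n - tau):
                total += 0.5 * (section[u] - section[u + tau]) ** 2
                count += 1
        d[tau] = total / count
    return d


if __name__ == "__main__":
    toy = [0, 1, 0, 1, 3, 2, 3, 2]  # two sections of four terms (made-up data)
    print(serial_variation_statistics(toy, n=4))
```

No pair straddling a section boundary is used, so a shift in level between the two sections of the toy series leaves every $d(\tau)$ unaffected.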

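The between-section jump statistic $B_\alpha$ is given by

$$B_\alpha = \frac{1}{k-1}\sum_{s=1}^{k-1}\tfrac12\left(\bar x_\alpha(sn+\tfrac12+\tfrac12\alpha) - \bar x_\alpha(sn+\tfrac12-\tfrac12\alpha)\right)^2. \tag{12}$$

Under the null hypothesis that $\mu_s$ is constant for $s = 1, \ldots, k$, we have

$$E(B_\alpha) = \delta_\alpha(\alpha). \tag{13}$$

Now for a general normal stationary stochastic series $y(t)$ it may be shown (cf. Jowett, 1955a) that

$$\operatorname{cov}\left[\tfrac12(y(t)-y(t+\tau))^2,\ \tfrac12(y(t')-y(t'+\tau))^2\right] = \tfrac12[\Delta^2_\tau\,\delta_y(t-t')]^2, \tag{14}$$

where $\Delta^2_\tau$ is the second central difference operator at interval $\tau$.

Applying this formula when $y = \bar\xi_\alpha$, under the null hypothesis of constant $\mu_s$, we have

$$\operatorname{var} B_\alpha = \frac{1}{(k-1)^2}\sum_{s=1}^{k-1}\sum_{s'=1}^{k-1}\tfrac12[\Delta^2_\alpha\delta_\alpha((s-s')n)]^2 \tag{15}$$

$$= \frac{1}{k-1}\sum_{m=-(k-2)}^{+(k-2)}\left(1-\frac{|m|}{k-1}\right)\tfrac12[\Delta^2_\alpha\delta_\alpha(mn)]^2 \tag{16}$$

$$= \frac{1}{k-1}\left(2[\delta_\alpha(\alpha)]^2 + \sum_{m=1}^{k-2}\Phi_m\right), \text{ say}, \tag{17}$$

where

$$\Phi_m = \left(1-\frac{m}{k-1}\right)[\Delta^2_\alpha\delta_\alpha(mn)]^2 \tag{18}$$

$$= \left(1-\frac{m}{k-1}\right)\left[\frac{1}{\alpha}\sum_{r=-(\alpha-1)}^{+(\alpha-1)}\left(1-\frac{|r|}{\alpha}\right)\Delta^2_\alpha\delta(mn+r)\right]^2; \tag{19}$$

the form (19) being obtained from (18) by substituting (9), the second term of which is annihilated by the differencing operation.

The first term in the bracket in (17) may be calculated from values of $\delta(\tau)$ which go only as far as $\tau = 2\alpha-1$, to estimate which serial variation statistics are available from within-section comparisons provided that $\alpha \le \tfrac12 n$. No direct estimates of $\delta(mn+r)$ (or equivalently, $\delta_\alpha(mn)$) are available for $|m| \ge 1$, but in principle at least the second differences in (18) may be estimated by averages of all possible products of within-section differences at lag $mn+r$, since if

$$p_\alpha(t,\tau) = [\bar x_\alpha(t-\tfrac12\tau-\tfrac12\alpha) - \bar x_\alpha(t-\tfrac12\tau+\tfrac12\alpha)]\,[\bar x_\alpha(t+\tfrac12\tau-\tfrac12\alpha) - \bar x_\alpha(t+\tfrac12\tau+\tfrac12\alpha)], \tag{20}$$

$$E\{p_\alpha(t,\tau)\} = \Delta^2_\alpha\delta_\alpha(\tau), \tag{21}$$

whenever $t$, $t+\alpha$ come from the same section and $t+\tau$, $t+\tau+\alpha$ come from the same section. The difficulties attached to this procedure arise through the necessity for graduating these averages by a suitable formula (cf. Bartlett, 1946) before substituting in (13).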

It is also useful to have a formula for the asymptotic logarithmic variance of $B_\alpha$:

$$\operatorname{var}(\log_e B_\alpha) \sim \frac{1}{k-1}\left(2 + \sum_{m=1}^{k-2}\Psi_m\right), \tag{22}$$

where

$$\Psi_m = \frac{\Phi_m}{[\delta_\alpha(\alpha)]^2}, \tag{23}$$

since the essentially positive nature of $B_\alpha$, coupled with some evidence from sampling experiments, suggests that $\log B_\alpha$ will have a sampling distribution closer to normal than that of $B_\alpha$. The null hypothesis may thus be tested by comparing

$$\log(B_\alpha/\delta_\alpha(\alpha)), \tag{24}$$

with a suitable multiple of its standard error, which is the square root of (22).

Since the variance of the location parameters $\mu_s$ is estimated by subtracting from an observed value $B_\alpha$ its expectation under the null hypothesis, given by (13), the asymptotically optimal value of $\alpha$ from the point of view of power is that which proves to give the smallest value for (17). In the general case a special ad hoc investigation is called for, though consideration of the limiting cases of strictly random variation within sections at one extreme and smooth continuous variation at the other suggests (as confirmed in § 3 below under the more restrictive assumption of locally Markoff variation) that smaller values of $\alpha$ are likely to be optimal for smoother series, and vice versa.

The general technique just described is both laborious and difficult to apply in practice; two forms of simplification are possible. The first, discussed here, applies if $\delta(\tau)$ satisfies certain smoothness assumptions; the second, discussed in § 3, applies if $\delta(\tau)$ may be represented approximately within a suitable interval centred at $\tau = 0$, by a function of the form

$$\theta(1-\rho^{|\tau|}). \tag{25}$$

The assumptions are less restrictive in the first case, but it is necessary to fit the serial variation function $\delta(\tau)$, whereas in the second case this is avoided.

The statistic $B_\alpha$ is a member of a class of statistics called local statistics, the general sampling properties of which have been discussed in Jowett (1955a). It is a function of local comparisons, being a mean square of differences involving adjacent groups of terms of the series, and therefore its sampling formulae are dominated by terms involving $\delta(\tau)$ for values of $\tau$ which are small compared with the total time interval covered by the data. The necessary condition on $\delta(\tau)$ is that it shall become increasingly linear as $\tau$ increases, in the sense described in Jowett (1955a), with the result that $\delta_\alpha(\tau)$ also becomes increasingly linear and $\Delta^2_\alpha\delta_\alpha(\tau)$ converges rapidly to zero (increasingly so the smaller the value of $\alpha$) as $|\tau|$ increases. If this condition is satisfied the terms $\Phi_m$ of (17) make contributions which decrease rapidly as $m$ increases. The whole of (17) will be dominated by the first term in the bracket, and the size of this will furnish a good guide for the optimization of the choice of $\alpha$. Having provisionally chosen $\alpha$ in this way, the practical statistician may reasonably follow the course of neglecting all but possibly the first one or two of $\Phi_1, \Phi_2, \ldots, \Phi_{k-2}$, unless there is positive evidence to warrant either their retention or a reduction in $\alpha$ sufficient to justify their omission.

Such evidence of a rather subjective kind may be furnished by scrutiny of the variogram constructed from the statistics $d(1), \ldots, d(n-1)$, which may give some indication of rough upper bounds for the second differences $\Delta^2_\alpha d(n-\alpha+1)$, $\Delta^2_\alpha d(n+\alpha-1)$, which straddle the range of second differences $\Delta^2_\alpha\delta(\tau)$, from which $\Delta^2_\alpha\delta_\alpha(n)$ is computed as a weighted average. More objective evidence would be the significance of a relevant mean of the product $p_\alpha(t, mn)$ of (20); this is essentially a straightforward mean of terms from a time series, albeit generated from $\xi(t)$ in a rather complicated way, and its standard error may be estimated by the general method given in Jowett (1955b). Neither method is particularly sensitive, but at least they provide some opportunity to check the working assumptions.

Usually advantage may be taken of the availability of a direct estimate of the dominant term of (17) which does not require explicit consideration of the serial variation curve. Corresponding to the statistic $B_\alpha$, which is the sum of semi-squares of what might be described as between-section jumps, is a statistic $W_\alpha$ which is the corresponding sum of semi-squares of comparable within-section jumps, defined by

$$W_\alpha = \frac{1}{k(n-2\alpha+1)}\sum_{s=1}^{k}\sum_{u=\alpha}^{n-\alpha}\tfrac12\left(\bar x_\alpha((s-1)n+\tfrac12+u+\tfrac12\alpha) - \bar x_\alpha((s-1)n+\tfrac12+u-\tfrac12\alpha)\right)^2. \tag{26}$$

Independently of the truth of the null hypothesis,

$$E(W_\alpha) = \delta_\alpha(\alpha), \tag{27}$$

so that $W_\alpha$ provides an estimate both of the expectation of $B_\alpha$ under the null hypothesis and of the dominant term in (17). On general grounds the relative error of this estimate may be expected to increase with $\alpha$, being the same as that of $B_\alpha$ for $\alpha = \tfrac12 n$, and this must be borne in mind in using it in (17) as a basis for a large-sample test of significance. More explicit consideration of the error of $W_\alpha$ under the assumption of locally Markoff variation will be given in § 3.
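The definitions (12), (26) and (1) translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the series is held in a zero-based Python list, and both statistics are computed as means of semi-squared jumps, as in the displayed formulae.

```python
import math


def block_mean(x, start, alpha):
    """Mean of the alpha consecutive terms x[start], ..., x[start + alpha - 1]."""
    return sum(x[start:start + alpha]) / alpha


def jump_statistics(x, n, alpha):
    """Return (B_alpha, W_alpha) for a series of k sections of n terms each."""
    if len(x) % n != 0 or not 1 <= alpha <= n // 2:
        raise ValueError("need len(x) a multiple of n and 1 <= alpha <= n/2")
    k = len(x) // n
    between, within = [], []
    # Between-section jumps: last alpha terms of a section against the
    # first alpha terms of the following section, eq. (12).
    for s in range(1, k):
        boundary = s * n
        jump = block_mean(x, boundary, alpha) - block_mean(x, boundary - alpha, alpha)
        between.append(0.5 * jump * jump)
    # Comparable within-section jumps: u = alpha, ..., n - alpha, eq. (26).
    for s in range(k):
        base = s * n
        for u in range(alpha, n - alpha + 1):
            jump = block_mean(x, base + u, alpha) - block_mean(x, base + u - alpha, alpha)
            within.append(0.5 * jump * jump)
    return sum(between) / len(between), sum(within) / len(within)


def jump_criterion(b, w):
    """J_alpha = ln(B_alpha / W_alpha), eq. (1)."""
    return math.log(b / w)


if __name__ == "__main__":
    toy = [0, 1, 0, 1, 3, 2, 3, 2]  # two sections of four terms (made-up data)
    b, w = jump_statistics(toy, n=4, alpha=1)
    print(b, w, jump_criterion(b, w))
```

For $\alpha = \tfrac12 n$ each section contributes a single within-section jump, which is why the relative error of $W_\alpha$ then matches that of $B_\alpha$.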

Although the exposition above is strictly applicable only under the assumptions that $\xi(t)$ is stationary and normal, the test is robust under departures from these assumptions in certain directions. It was shown in Jowett (1957) that formulae for sampling properties of local statistics in stationary series could be applied as an approximation in non-stationary series having the properties of smooth heteromorphy and acting normality. For a full discussion the reader is referred to the paper in question, but the implications in the present case are that formulae such as (17) may be applied when the variation in $\xi(t)$ is locally homogeneous and normal within intervals comparable in length to the maximum lag $\tau$ for which the terms of the formula are non-negligible. The rapid convergence of (17) is vulnerable mainly to short-term periodic tendencies of autoregressive type, unless $\alpha$ can be taken as a multiple of the quasi-periods.


3. THE JUMP CRITERION $J_\alpha$ AND ITS APPLICATION IN LOCALLY MARKOFF SERIES


If $\alpha$ is small enough for the error in $W_\alpha$ to be neglected, and for all except the dominant term in (17) to be neglected, it becomes possible to carry out the test of significance for $B_\alpha$ by calculating the jump criterion $J_\alpha$ defined by (1) and testing it against a null hypothesis value of zero, using its asymptotic sampling variance as given by (22), which reduces to $2/(k-1)$. In this form the test itself becomes extremely simple to carry out, the difficulty being rather one of assessing whether the conditions necessary for its validity are satisfied. Also, considerations of power may dictate a value of $\alpha$ larger than that for which they can be satisfied.

To explore the potentialities of $J_\alpha$ as a test statistic, a numerical investigation was carried out using the two-parameter Markoff serial variation function:

$$\delta(\tau) = \theta(1-\rho^{|\tau|}). \tag{28}$$

Numerical values taken were $\rho = 0.0\,(0.1)\,0.9,\ 0.95$; $n = 12$; $k = \infty$; $\alpha = 1\,(1)\,6$. It was unnecessary to take numerical values for $\theta$, which as a scale parameter does not affect the sampling properties of $J_\alpha$.


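From (16), for large $k$,

$$\operatorname{var} B_\alpha \sim \frac{1}{k}\sum_{m=-\infty}^{+\infty}\tfrac12[\Delta^2_\alpha\delta_\alpha(mn)]^2. \tag{29}$$

Similarly,

$$\operatorname{var} W_\alpha = \frac{1}{k^2(n-2\alpha+1)^2}\sum_{s=1}^{k}\sum_{s'=1}^{k}\sum_{u=\alpha}^{n-\alpha}\sum_{u'=\alpha}^{n-\alpha}\tfrac12[\Delta^2_\alpha\delta_\alpha((s-s')n-u+u')]^2 \tag{30}$$

$$\sim \frac{1}{k(n-2\alpha+1)^2}\sum_{m=-\infty}^{+\infty}\sum_{u=\alpha}^{n-\alpha}\sum_{u'=\alpha}^{n-\alpha}\tfrac12[\Delta^2_\alpha\delta_\alpha(mn-u+u')]^2, \tag{31}$$

$$\operatorname{cov}(B_\alpha, W_\alpha) = \frac{1}{k(k-1)(n-2\alpha+1)}\sum_{s=1}^{k-1}\sum_{s'=1}^{k}\sum_{u=\alpha}^{n-\alpha}\tfrac12[\Delta^2_\alpha\delta_\alpha((s-s')n+u)]^2 \tag{32}$$

$$\sim \frac{1}{k(n-2\alpha+1)}\sum_{m=-\infty}^{+\infty}\sum_{u=\alpha}^{n-\alpha}\tfrac12[\Delta^2_\alpha\delta_\alpha(mn+u)]^2. \tag{33}$$

Formulae (29), (31), (33) were computed in the cases described, taking $\theta = 1$, and the results substituted in the formulae

$$\operatorname{var}(B_\alpha - W_\alpha) = \operatorname{var} B_\alpha - 2\operatorname{cov}(B_\alpha, W_\alpha) + \operatorname{var} W_\alpha, \tag{34}$$

$$\operatorname{var} J_\alpha \sim \frac{\operatorname{var}(B_\alpha - W_\alpha)}{[E(W_\alpha)]^2}, \tag{35}$$

to give the results in Table 1.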
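The computation leading to Table 1 can be sketched as follows. This is a reconstruction under stated assumptions, not the authors' program: the function names are hypothetical, $\theta = 1$, and the infinite sums over $m$ in (29), (31) and (33) are truncated at $|m| \le 40$, which is ample for the values of $\rho$ considered because the terms decay geometrically.

```python
def markoff_delta(tau, rho, theta=1.0):
    """Markoff serial variation function theta * (1 - rho**|tau|), eq. (28)."""
    return theta * (1.0 - rho ** abs(tau))


def delta_alpha(delta, alpha, tau):
    """Serial variation function of centralized averages of alpha terms, eq. (9)."""
    return sum((1.0 - abs(r) / alpha) / alpha * (delta(tau + r) - delta(r))
               for r in range(-(alpha - 1), alpha))


def half_squared_second_difference(delta, alpha, tau):
    """0.5 * [second central difference of delta_alpha at interval alpha]**2."""
    diff = (delta_alpha(delta, alpha, tau + alpha)
            - 2.0 * delta_alpha(delta, alpha, tau)
            + delta_alpha(delta, alpha, tau - alpha))
    return 0.5 * diff * diff


def table1_entries(rho, alpha, n=12, max_m=40):
    """Large-k values of k var B, k var W, k var(B - W), E(W) and k var J."""
    delta = lambda tau: markoff_delta(tau, rho)
    term = lambda tau: half_squared_second_difference(delta, alpha, tau)
    ms = range(-max_m, max_m + 1)      # truncated range of m (assumption)
    us = range(alpha, n - alpha + 1)   # positions of within-section jumps
    k_var_b = sum(term(m * n) for m in ms)                                  # (29)
    k_var_w = sum(term(m * n - u + v)
                  for m in ms for u in us for v in us) / len(us) ** 2       # (31)
    k_cov = sum(term(m * n + u) for m in ms for u in us) / len(us)          # (33)
    k_var_diff = k_var_b - 2.0 * k_cov + k_var_w                            # (34)
    e_w = delta_alpha(delta, alpha, alpha)                                  # (27)
    return {"k_var_b": k_var_b, "k_var_w": k_var_w, "k_var_diff": k_var_diff,
            "e_w": e_w, "k_var_j": k_var_diff / e_w ** 2}                   # (35)


if __name__ == "__main__":
    for rho, alpha in [(0.0, 1), (0.5, 1), (0.0, 6)]:
        entries = table1_entries(rho, alpha)
        print(rho, alpha, {name: round(value, 3) for name, value in entries.items()})
```

For $\rho = 0$, $\alpha = 1$ the sums can be checked by hand: $k \operatorname{var} B_\alpha = 2$, $k \operatorname{var} W_\alpha = 32/121$ and $k \operatorname{cov} = 1/11$, giving $k \operatorname{var}(B_\alpha - W_\alpha) = 252/121 \approx 2.083$.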

The following broad conclusions emerge.

(i) Below $\rho = 0.6$, $k \operatorname{var}(B_\alpha - W_\alpha)$ decreases to a minimum at or near $\alpha = 6$ (i.e. $\tfrac12 n$); above $\rho = 0.6$, it increases from a minimum for $\alpha = 1$ (i.e. the smallest possible value of $\alpha$). This indicates which values of $\alpha$ give the most powerful test.

(ii) The pattern of rise or fall in $k \operatorname{var}(B_\alpha - W_\alpha)$ with $\alpha$ strongly resembles that of $E(W_\alpha)$, except for $\rho = 0.6$, where variation in either is small. This confirms the appropriateness in this case of the practice recommended in § 2, namely, choosing as the optimal $\alpha$ the value minimizing $E(W_\alpha)$.

(iii) Below $\rho = 0.6$, $k \operatorname{var} J_\alpha < 2.2$ when $\alpha \le 5$ (actually when $\alpha \le 6$ for $\rho < 0.3$); at and above $\rho = 0.6$, $k \operatorname{var} J_\alpha < 2.2$ when $\alpha = 1$. Thus if we are prepared to content ourselves with a slightly suboptimal value of $\alpha$ in certain cases, we can cover ourselves against errors of the first kind at least to the usual extent by working with the formula (2), which provides us with a simple test along the lines described at the beginning of the section, and which is not restricted to small values of $\alpha$ in cases where a large value of $\alpha$ would be optimal. The factor 2.2 is a convenient round number which is not unacceptably large.

(iv) In the cases mentioned in (iii), the error committed by neglecting all but the dominant term in (17) proves to be negligible. Evidently the increase of the factor from 2 to 2.2 is occasioned by the need to allow for the sampling error of $W_\alpha$.

Table 1. Variances and covariances of jump statistics for Markoff serial variation function with n = 12

Columns give values of ρ; rows give values of α. In the block for k var B_α, a figure in brackets is the percentage increase over the dominant term alone (shown in italics in the original).

α | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 0.95

k var B_α
1 | 2.000 | 1.620 | 1.280 | 0.980 | 0.720 | 0.500 | 0.320 | 0.180 | 0.080 | 0.020 | 0.005
2 | 0.500 | 0.540 | 0.558 | 0.548 | 0.508 | 0.440 | 0.346 | 0.237 | 0.127 | 0.038 (1) | 0.010
3 | 0.222 | 0.268 | 0.315 | 0.357 | 0.385 | 0.389 | 0.359 | 0.290 | 0.183 (1) | 0.065 (1) | 0.019
4 | 0.125 | 0.160 | 0.200 | 0.246 | 0.292 | 0.329 | 0.344 | 0.316 (1) | 0.233 (2) | 0.096 (3) | 0.031 (3)
5 | 0.080 | 0.106 | 0.138 | 0.178 | 0.225 | 0.274 | 0.315 (2) | 0.326 (2) | 0.274 (4) | 0.130 (6) | 0.044 (4)
6 | 0.056 | 0.075 | 0.100 | 0.134 | 0.177 | 0.230 (1) | 0.286 (2) | 0.329 (6) | 0.313 (10) | 0.167 (10) | 0.059 (7)

k var W_α
1 | 0.265 | 0.202 | 0.152 | 0.111 | 0.078 | 0.052 | 0.032 | 0.018 | 0.008 | 0.002 | 0.001
2 | 0.088 | 0.096 | 0.100 | 0.100 | 0.094 | 0.082 | 0.066 | 0.046 | 0.025 | 0.008 | 0.002
3 | 0.058 | 0.074 | 0.089 | 0.104 | 0.116 | 0.120 | 0.114 | 0.095 | 0.063 | 0.023 | 0.007
4 | 0.045 | 0.061 | 0.080 | 0.102 | 0.126 | 0.148 | 0.161 | 0.157 | 0.124 | 0.055 | 0.018
5 | 0.047 | 0.065 | 0.089 | 0.119 | 0.156 | 0.197 | 0.235 | 0.253 | 0.222 | 0.103 | 0.038
6 | 0.056 | 0.075 | 0.100 | 0.134 | 0.177 | 0.230 | 0.286 | 0.329 | 0.313 | 0.167 | 0.059

k var (B_α − W_α)
1 | 2.083 | 1.701 | 1.354 | 1.043 | 0.770 | 0.537 | 0.345 | 0.195 | 0.087 | 0.022 | 0.006
2 | 0.519 | 0.563 | 0.586 | 0.581 | 0.545 | 0.477 | 0.381 | 0.265 | 0.145 | 0.045 | 0.012
3 | 0.231 | 0.281 | 0.331 | 0.378 | 0.412 | 0.423 | 0.400 | 0.333 | 0.221 | 0.083 | 0.025
4 | 0.123 | 0.158 | 0.199 | 0.246 | 0.294 | 0.338 | 0.368 | 0.363 | 0.299 | 0.140 | 0.047
5 | 0.074 | 0.098 | 0.128 | 0.168 | 0.218 | 0.281 | 0.353 | 0.418 | 0.414 | 0.218 | 0.078
6 | 0.056 | 0.077 | 0.108 | 0.151 | 0.212 | 0.298 | 0.412 | 0.536 | 0.574 | 0.325 | 0.113

E(W_α)
1 | 1.000 | 0.900 | 0.800 | 0.700 | 0.600 | 0.500 | 0.400 | 0.300 | 0.200 | 0.100 | 0.050
2 | 0.500 | 0.520 | 0.528 | 0.523 | 0.504 | 0.469 | 0.416 | 0.344 | 0.252 | 0.138 | 0.072
3 | 0.333 | 0.366 | 0.397 | 0.422 | 0.439 | 0.441 | 0.424 | 0.380 | 0.302 | 0.179 | 0.097
4 | 0.250 | 0.282 | 0.317 | 0.351 | 0.382 | 0.406 | 0.414 | 0.396 | 0.338 | 0.216 | 0.122
5 | 0.200 | 0.230 | 0.262 | 0.298 | 0.335 | 0.370 | 0.396 | 0.400 | 0.363 | 0.248 | 0.145
6 | 0.167 | 0.193 | 0.224 | 0.258 | 0.297 | 0.337 | 0.373 | 0.395 | 0.378 | 0.275 | 0.167

E(W_α)/E(W_1)
1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
2 | 0.500 | 0.578 | 0.660 | 0.747 | 0.840 | 0.938 | 1.040 | 1.147 | 1.260 | 1.380 | 1.440
3 | 0.333 | 0.407 | 0.496 | 0.603 | 0.732 | 0.882 | 1.060 | 1.267 | 1.510 | 1.790 | 1.940
4 | 0.250 | 0.313 | 0.396 | 0.501 | 0.637 | 0.812 | 1.035 | 1.320 | 1.690 | 2.160 | 2.440
5 | 0.200 | 0.256 | 0.328 | 0.426 | 0.558 | 0.740 | 0.990 | 1.333 | 1.815 | 2.480 | 2.900
6 | 0.167 | 0.214 | 0.280 | 0.369 | 0.495 | 0.674 | 0.933 | 1.317 | 1.890 | 2.750 | 3.340

k var J_α
1 | 2.08 | 2.10 | 2.12 | 2.13 | 2.14 | 2.15 | 2.16 | 2.16 | 2.17 | 2.18 | 2.20
2 | 2.07 | 2.09 | 2.10 | 2.12 | 2.14 | 2.17 | 2.20 | 2.24 | 2.29 | 2.35 | 2.37
3 | 2.08 | 2.10 | 2.10 | 2.12 | 2.14 | 2.17 | 2.22 | 2.30 | 2.43 | 2.60 | 2.70
4 | 1.97 | 1.98 | 1.99 | 2.00 | 2.02 | 2.06 | 2.14 | 2.31 | 2.61 | 3.00 | 3.17
5 | 1.84 | 1.85 | 1.86 | 1.89 | 1.94 | 2.05 | 2.25 | 2.61 | 3.14 | 3.55 | 3.73
6 | 2.00 | 2.06 | 2.15 | 2.26 | 2.41 | 2.63 | 2.96 | 3.44 | 4.02 | 4.30 | 4.07

The numerical investigation just described involved a considerable amount of computation and yet covers only one particular case. To justify application to other cases it was necessary to examine the relevance and robustness of the conclusions drawn under the following conditions: (a) departure of the serial variation function from strict Markoff form; (b) departure of $n$ from the selected value $n = 12$; (c) the value of $k$ not large.

(a) If the serial variation function may be represented in the form

$$\delta(\tau) = \theta(1-\rho^{|\tau|}) + \omega(\tau), \tag{36}$$

where $\omega(\tau)$ is negligible for $\tau \le 2\alpha-1$ and has a second derivative so small that $\Delta^2_\alpha\omega(\tau)$ is negligible for $\tau > \alpha$, the effect of the term $\omega(\tau)$ on (21), (23) and (24) will simply be the addition of an average of these negligible quantities. This more general form of serial variation function, described as locally Markoff and discussed in a wider context in Davies & Jowett (1958), appears to have sufficient extra flexibility to serve as a robust working assumption in the absence of quasi-cycle variation, analogously with the normal distribution in classical theory.

For example, the function

$$\delta(\tau) = 0.81(1-0.8^{|\tau|}) + 0.19(1-0.2^{|\tau|}) \tag{37}$$

illustrates a common type of divergence from the strict Markoff form, namely a slower decline to zero in the initially steep gradient, and is roughly representable in the form (36) for $\rho = 0.6$. The sampling parameters of Table 1 computed for this function proved to be similar in essentials in their pattern of variation, $\alpha$ being nearly optimal at 1, $E(W_\alpha)$ varying little, and $k \operatorname{var} J_\alpha$ being less than 2.2 for $\alpha = 1$ (and also $\alpha = 2, 3$).

The robustness of the assumption of locally Markoff serial variation is of course greater the smaller the value of $\alpha$; however, its use need not be restricted to small values of $\alpha$.

(b) Supplementary investigations similar to that yielding Table 1, taking $n = 6$, $\alpha = 1, 2, 3$, $\rho = (0.2)^2, (0.6)^2$ and $(0.8)^2$, and $n = 24$, $\alpha = 1, 8, 10, 11, 12$, $\rho = \sqrt{0.2}, \sqrt{0.6}, \sqrt{0.8}$, were carried out to test the extension of conclusions (i)-(iv) to smaller and larger values of $n$. The values of $\rho$ taken are equivalent to 0.2, 0.6, 0.8 in terms of section length.

It was found that, as before, the patterns of $k \operatorname{var}(B_\alpha - W_\alpha)$ and $E(W_\alpha)$ were similar; $\alpha$ was optimal at unity for the highest values of $\rho$ and not greatly suboptimal at one less than its maximum for the lowest value of $\rho$.

There was also evidence to suggest that the transitional value of the Markoff parameter in terms of section length increases with $n$, the middle value being below the transition for $n = 6$, above it for $n = 24$. Also 2.2 remained a safe approximation to $k \operatorname{var} J_\alpha$ in the corresponding cases.

(c) As a rough check on the applicability of the asymptotic theory for values of $k$ not large and to gain experience, jump analyses were carried out for twenty artificially constructed Markoff series in each of the cases:

ρ = 0.8; n = 6, k = 10; α = 1.
ρ = 0.36; n = 6, k = 10; α = 2.
ρ = 0.2; n = 6, k = 10; α = 3.

It was found that the distribution of $J_\alpha$ was consistent with symmetry, satisfying a $\chi^2$ goodness of fit test for a normal curve, and that only two out of the sixty values fell above the quasi-normal upper 5 % point at 0.813 ($= 1.645\sqrt{2.2/9}$).
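A sampling experiment of this kind can be imitated as below. This is only a sketch under stated assumptions and does not reproduce the authors' artificial series: the Markoff series is generated as a stationary first-order autoregression with unit variance, for which $\delta(\tau) = 1 - \rho^{|\tau|}$, and the function names and the seed are arbitrary choices.

```python
import math
import random


def markoff_series(length, rho, rng):
    """Stationary first-order autoregressive series with unit variance."""
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(length - 1):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x


def jump_criterion(x, n, alpha):
    """J_alpha = ln(B_alpha / W_alpha) for k = len(x) // n sections of n terms."""
    k = len(x) // n
    mean = lambda start: sum(x[start:start + alpha]) / alpha
    between = [0.5 * (mean(s * n) - mean(s * n - alpha)) ** 2 for s in range(1, k)]
    within = [0.5 * (mean(s * n + u) - mean(s * n + u - alpha)) ** 2
              for s in range(k) for u in range(alpha, n - alpha + 1)]
    b = sum(between) / len(between)
    w = sum(within) / len(within)
    return math.log(b / w)


def simulate(rho, n, k, alpha, runs=20, seed=0):
    """Jump criteria for `runs` independent series under the null hypothesis."""
    rng = random.Random(seed)
    return [jump_criterion(markoff_series(n * k, rho, rng), n, alpha)
            for _ in range(runs)]


if __name__ == "__main__":
    critical = 1.645 * math.sqrt(2.2 / (10 - 1))  # quasi-normal upper 5 % point
    values = simulate(rho=0.8, n=6, k=10, alpha=1)
    exceed = sum(v > critical for v in values)
    print(round(critical, 3), exceed, "of", len(values), "values exceed it")
```

With only twenty series per case the observed proportion of exceedances is necessarily a coarse check, as the original text acknowledges.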


The formula (2) is therefore proposed as the usual criterion for the significance of $J_\alpha$ on the grounds that it makes a small allowance which is adequate to cover the non-vanishing terms of the summation in (16) and the error of $W_\alpha$ at least in the appropriate Markoff cases, and which, in the absence of evidence to the contrary, may reasonably be taken to do so in other cases, or at least will be better than making no allowance at all.


4. COMPARISON WITH THE ANALYSIS OF VARIANCE TEST 


Since only adjacent sections are contrasted in jump analysis, one would expect some loss of power compared with a method based on the same principle as the analysis of variance, where the means of all possible pairs of sections are effectively compared. The latter approach is, of course, not usually possible when the serial variation curve is not known fairly completely; nevertheless, the comparison is still of some interest, and we shall draw it for large $k$ and an assumed random normal variation in the parameters $\mu_s$ characterized by a standard deviation $\sigma_\mu$, taking the serial variation curve as being unit Markoff in form, and taking $n = 12$ as in § 3. Under these conditions, writing $B_{\alpha 0}$ as the value of $B_\alpha$ under the null hypothesis, we have

$$(k-1)B_\alpha = (k-1)B_{\alpha 0} + \sum_{s=1}^{k-1}\tfrac12(\mu_{s+1}-\mu_s)^2 + \sum_{s=1}^{k-1}[\bar\xi_\alpha(sn+\tfrac12+\tfrac12\alpha) - \bar\xi_\alpha(sn+\tfrac12-\tfrac12\alpha)](\mu_{s+1}-\mu_s). \tag{38}$$

The covariances of the three terms on the right-hand side of (38) are clearly zero; the variance of the first is given by (17), and the variances of the second and third are easily shown to be given respectively by

$$\sigma_\mu^4(3k-4) \tag{39}$$

and

$$2(k-2)\sigma_\mu^2[\Delta^2_\alpha\delta_\alpha(0) - \Delta^2_\alpha\delta_\alpha(n)] + 2\sigma_\mu^2\Delta^2_\alpha\delta_\alpha(0), \tag{40}$$

so that asymptotically

$$k \operatorname{var} B_\alpha \sim k \operatorname{var} B_{\alpha 0} + 3\sigma_\mu^4 + 2\sigma_\mu^2[\Delta^2_\alpha\delta_\alpha(0) - \Delta^2_\alpha\delta_\alpha(n)]. \tag{41}$$

Also

$$E(B_\alpha) = E(B_{\alpha 0}) + \sigma_\mu^2, \tag{42}$$

so that the estimate of $\sigma_\mu^2$ given by

$$\operatorname{Est}_1\sigma_\mu^2 = B_\alpha - E(B_{\alpha 0}) = B_\alpha - E(W_\alpha) \tag{43}$$

has the same sampling variance as $B_\alpha$.
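Alternatively, we have an estimate from the calculated variance $\operatorname{var}_c\bar x_n$ of section means given by

$$\operatorname{Est}_2\sigma_\mu^2 = \operatorname{var}_c\bar x_n - E[\operatorname{var}_c\bar\xi_n]. \tag{44}$$

Hence, similarly to the preceding argument,

$$\operatorname{var}(\operatorname{Est}_2\sigma_\mu^2) = \operatorname{var}(\operatorname{var}_c\bar x_n) = \operatorname{var}(\operatorname{var}_c\bar\xi_n) + \operatorname{var}(\operatorname{var}_c\mu) + \operatorname{var}[2\operatorname{cov}_c(\bar\xi_n, \mu)], \tag{45}$$

where the suffix $c$ implies that the variance or covariance is a calculated, not expected, one. Now, using formula (5) of Bartlett (1946) we have

$$\operatorname{var}(\operatorname{var}_c\bar\xi_n) \sim \frac{2}{k}\sum_{s=-\infty}^{+\infty}\{\delta_n(\infty) - \delta_n(sn)\}^2 \tag{46}$$

and it is a standard result that

$$\operatorname{var}(\operatorname{var}_c\mu) \sim \frac{2\sigma_\mu^4}{k}. \tag{47}$$

For the third term

$$\operatorname{var}[2\operatorname{cov}_c(\bar\xi_n, \mu)] \sim \frac{4}{k^2}\sum_{s,s'=1}^{k}E\{\bar\xi_n((s-1)n+\tfrac12+\tfrac12 n)\,\bar\xi_n((s'-1)n+\tfrac12+\tfrac12 n)\,\mu_s\mu_{s'}\}$$
$$\sim \frac{4}{k^2}\sum_{s=1}^{k}\sigma_\mu^2\operatorname{var}[\bar\xi_n((s-1)n+\tfrac12+\tfrac12 n)]$$
$$= \frac{4}{k}\sigma_\mu^2(\delta_n(\infty) - \delta_n(0)). \tag{48}$$

We have used these formulae for asymptotic variances to compute the efficiency of $\operatorname{Est}_1\sigma_\mu^2$ relative to $\operatorname{Est}_2\sigma_\mu^2$ for $n = 12$ and a number of different values of $\alpha$, $\rho$ and $\sigma_\mu$.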

The results are given in Table 2. They all settle down to a limit of 67 %, i.e. $\tfrac23$, as $\sigma_\mu \to \infty$, approaching it from above in the cases $\rho \ge 0.7$ and from below in the cases $\rho \le 0.6$, the optimum jump-analysis value of $\alpha$ being chosen in each case. In the cases where the limit is approached from above, the relative efficiency can be considerably greater than 100 % when $\sigma_\mu^2$ is of the same order as $\sigma_\xi^2$ (which has the value of unity in our calculations) or smaller. It is in precisely these cases that $W_\alpha$ supplies a stable estimate of $E(B_{\alpha 0})$, so that this asymptotic comparison has a relevance to practical jump analysis; in the other cases it is more abstract, since $W_\alpha$ has a variance comparable with that of $B_{\alpha 0}$. However, the graphs are consistent with what we would expect on general grounds, namely, that as the variation of $\xi(t)$ approaches randomness, a test based on variance is more efficient than any jump analysis. It should not, of course, be forgotten that even in cases where the acting Markoff serial variation curve has a small parameter, the true curve may diverge from it for lags greater than the section length, rendering a test based on variance impossible to carry out, for the reasons given in § 1.

Table 2. Asymptotic efficiency (percentage) of estimate of σ_μ² from B_α compared with that obtained from var_c x̄_n

σ_μ² (unit = σ_ξ²) | ρ = 0.0 (Opt. α = 6) | ρ = 0.1 (Opt. α = 6) | ρ = 0.6 (Opt. α illegible) | ρ = 0.95 (Opt. α = 1)
1/4 | 54 | 54 | 63 | 1700
1 | 63 | 63 | 67 | 262
4 | 66 | 66 | 67 | 99
25 | 67 | 67 | 67 | 71
100 | 67 | 67 | 67 | 68

Table 3. Increased sampling variance of estimate of σ_μ² from var_c x̄_n due to presence of within-section serial correlation

Ratio of variance (percentage) to that for estimate from random data.

σ_μ² (unit = σ_ξ²) | ρ = 0.1 | ρ = 0.6 | ρ = 0.95
1/4 | 110 | 254 | 1818
1 | 103 | 140 | 846
4 | 101 | 110 | 145
25 | 100 | 102 | 106
100 | 100 | 100 | 101

That jump analysis can be more efficient than a test based on variance, were such a test practicable, in cases where the parameter is large is not intuitively surprising; a large parameter reflects smoothness, and discontinuities from section to section become increasingly obvious as this smoothness increases. This gain can be almost large enough to offset the loss due to the presence of serial correlation, as can be seen from comparison of Tables 2 and 3; this occurs when the entry in Table 2 is almost as large as the corresponding entry in Table 3. The information yielded by jump analysis on serially correlated data as a fraction of that which would be yielded by variances of random data with the same dispersion is given roughly by dividing the former by the latter, and is evidently sometimes quite small.


Illustrative example 


The data in Table 4 below consist of 156 terms of an industrial time series which has been arbitrarily broken up into $k = 12$ successive sections of $n = 13$ consecutive terms each. The analysis given is fuller than would be warranted in many practical applications, but even so begs certain questions (e.g. testing the goodness of fit of the serial variation curve) which lie outside the scope of this paper and might reasonably form the subject of inquiry.


Table 4. Weekly determinations of percentage of free carbon in coking coal as charged (origin = 46, unit = 0.1)

Columns give the section (1 to 12); rows give the term within the section.

Term | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
1 | 40 | 71 | 67 | 61 | 39 | 46 | 58 | 35 | 78 | 46 | 24 | 38
2 | 27 | 47 | 69 | 49 | 27 | 65 | 51 | 50 | 32 | 38 | 32 | 32
3 | 40 | 65 | 58 | 65 | 46 | 60 | 38 | 43 | 43 | 21 | 35 | 25
4 | 45 | 44 | 58 | 50 | 53 | 62 | 57 | 54 | 43 | 26 | 35 | 27
5 | 34 | 6 | 70 | 59 | 45 | 50 | 53 | 41 | 52 | 24 | 28 | 19
6 | 42 | 53 | 48 | 58 | 25 | 59 | 64 | 45 | 56 | 38 | 44 | 26
7 | 30 | 56 | 53 | 58 | 35 | 46 | 69 | 34 | 46 | 30 | 46 | 28
8 | 45 | 61 | 56 | 44 | 34 | 50 | 58 | 40 | 44 | 42 | 36 | 22
9 | 54 | 58 | 59 | 49 | 29 | 53 | 71 | 48 | 36 | 37 | 33 | 41
10 | 50 | 96 | 66 | 45 | 37 | 72 | 52 | 50 | 40 | 19 | 31 | 34
11 | [illegible] | 72 | 72 | 45 | 48 | 56 | 43 | 35 | 35 | 44 | 37 | 26
12 | 57 | 86 | 66 | 51 | 55 | 56 | 27 | 21 | 35 | 39 | 28 | 29
13 | 49 | 47 | 68 | 46 | 36 | 57 | 23 | 50 | 49 | 25 | 32 | 33





The first necessary step is to calculate and compare two estimates of $W_\alpha$ from alternate sections for $\alpha = 1$, $\alpha = 6$. We have calculated it for these and for intermediate values of $\alpha$, the bulk of the labour consisting of computing moving differences of moving averages of $\alpha$ terms for the 156 successive terms.

In view of the remarks in the paragraph following that where (16) occurs, a choice of $\alpha = 6$ seems optimal, giving the value 30.1 for $W_\alpha$. The value of $B_\alpha$ for $\alpha = 6$ is 32.1, so that

$$J_\alpha = \ln(32.1/30.1) = 0.0643. \tag{49}$$

From (2) the prima facie standard error of $J_\alpha$ is

$$\sqrt{2.2/11} = 0.447, \tag{50}$$

so that $J_\alpha$ is not significant on the 5 % level. No further analysis is necessary, since any factor more correct than 2.2 in (50) would certainly not be less than 2. In more critical circumstances, however, the need for revision of (50) might be investigated along the following lines.
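The arithmetic of (49) and (50) may be checked with a few lines of code. The sketch below is illustrative: the function name is hypothetical, and the one-sided comparison with the normal 5 % point 1.645 follows the usage in § 3(c).

```python
import math


def jump_test(b, w, k, factor=2.2, z_crit=1.645):
    """Return (J, standard error, significant?) using formulae (1) and (2)."""
    j = math.log(b / w)
    se = math.sqrt(factor / (k - 1))
    return j, se, j > z_crit * se


if __name__ == "__main__":
    # Values quoted in the text for the coal series: B = 32.1, W = 30.1, k = 12.
    j, se, significant = jump_test(32.1, 30.1, 12)
    print(round(j, 4), round(se, 3), significant)
```

Replacing the factor 2.2 by its lower limit 2 reduces the standard error only to about 0.426, so the conclusion is unaffected, as the text remarks.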


(a) Scrutiny of the variogram 
For fitting a locally Markoff serial variation function to the data the least-squares method (Davies & Jowett, 1958) would be appropriate; however, the breaking of the series into short sections and the non-availability of $d(15)$ prevent the tables in that paper from being used, and circumstances do not justify the labour of an ad hoc computation. However, as was done by Davies & Jowett, it may fairly be assumed that the fitted function will reproduce $d(1)$ (which is identical with $W_1$), and it is both convenient and reasonable to obtain the remaining equation necessary for the estimation of the two Markoff parameters in (28) by equating $W_6/W_1$ to $E(W_6)/E(W_1)$ in Table 1.

Table 5. Values of W_α for coal series

α | Odd sections | Even sections | All sections
1 | 61.9 | 87.8 | 74.9
2 | 48.0 | 34.9 | 41.4
3 | 51.1 | 31.4 | 41.2
4 | 50.8 | 31.9 | 41.3
5 | 35.1 | 32.5 | 33.8
6 | 27.5 | 32.7 | 30.1

Hence from Table 5

$$\operatorname{Est}\theta\,(1 - \operatorname{Est}\rho) = 74.9,$$

and the value of 0.402 for $W_6/W_1$ suggests the value 0.33 for $\operatorname{Est}\rho$ (using inverse linear interpolation in the appropriate section of Table 1), so that $\operatorname{Est}\theta = 112$.
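The inverse interpolation can be retraced as follows. This is a sketch only: the function name is hypothetical, the tabulated ratios are the $\alpha = 6$ row of the $E(W_\alpha)/E(W_1)$ block of Table 1, and the estimate of $\rho$ is rounded to two decimals before $\theta$ is computed, which is assumed to be what was done in the text.

```python
def inverse_linear_interpolation(xs, ys, target):
    """Find x with y(x) = target by linear interpolation between tabulated points."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if min(y0, y1) <= target <= max(y0, y1):
            return x0 + (x1 - x0) * (target - y0) / (y1 - y0)
    raise ValueError("target lies outside the tabulated range")


# Row alpha = 6 of E(W_alpha)/E(W_1) in Table 1, against the tabulated rho.
RHOS = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
RATIO_ALPHA_6 = [0.167, 0.214, 0.280, 0.369, 0.495, 0.674,
                 0.933, 1.317, 1.890, 2.750, 3.340]

if __name__ == "__main__":
    w1, w6 = 74.9, 30.1  # "All sections" column of Table 5
    rho_hat = round(inverse_linear_interpolation(RHOS, RATIO_ALPHA_6, w6 / w1), 2)
    theta_hat = w1 / (1.0 - rho_hat)  # from Est theta * (1 - Est rho) = W_1
    print(round(w6 / w1, 3), rho_hat, round(theta_hat))
```

Since $E(W_\alpha) = \delta_\alpha(\alpha)$ does not involve $n$, the use of a table computed for $n = 12$ with data having $n = 13$ introduces no inconsistency at this step.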

The variogram has been plotted in Fig. 1, this fitted curve being shown. The later serial variation statistics are very erratic, being based on only very few comparisons, and their departure from the fitted curve is probably of no consequence; the standard error of $d(2)$ (cf. Table 11 of Davies & Jowett, 1958) is roughly 16 and the discrepancy of 26, though not convincing in itself, taken along with the evidence of $d(3)$, $d(4)$ and $d(5)$ suggests a serial variation curve such as that shown by the broken line as being more likely prima facie. Following the suggestion in the antepenultimate paragraph of § 2, we extrapolate the curve to a value of 127 at $\tau = 14$, and find that $\Delta^2_6\delta(8) = -15$ for the broken curve as compared with $-12$ for the Markoff curve; a fair working assumption is that $\Delta^2_6\delta(18)$ is negligible for both curves. Taking the average suggests values of $\Psi_1$ in (22) of about 0.06, 0.04 respectively, both rather trifling. Hence the use of (2) is not contra-indicated.

Fig. 1. Variogram and fitted curves for free carbon series.


(b) Assessment of mean product of differences 


Using (20) we obtain the following series of values (rounded) for $p_\alpha(t, 13)$ from the data:

t | p(t) | t | p(t) | t | p(t)
13 | 149 | 65 | -1 | 117 | -35
14 | 233 | 66 | 0 | 118 | -27
26 | 5 | 78 | 0 | 130 | 7
27 | 78 | 79 | -3 | 131 | -19
39 | [illegible] | 91 | 1 | 143 | 5
40 | -59 | 92 | 45 | 144 | -18
52 | [illegible] | 104 | 86 | |
53 | -15 | 105 | 26 | Mean | $\bar p = 20$

If we were to take the mean at its face value, it would suggest a value of 0.4 for $\Psi_1$, giving

$$\operatorname{var}(\log B_\alpha) \sim \frac{1}{11}\left\{2 + \frac{10}{11}\left(\frac{20}{30.1}\right)^2\right\} = \frac{2.4}{11},$$

suggesting a value slightly greater than that given in (50) for the standard error of $J_\alpha$. However, the variation in the figures is considerable, and there is no prima facie case for relying on this correction.

It is possible to make only a rough estimate of the standard error of $\bar p$ in view of the nature and shortness of the series. The serial variation curve for $p_\alpha(t, 13)$ may be expected to be complicated in shape, in view of the many common values of $x(t)$ involved in neighbouring values of $p(t)$, and its estimation from the data is quite impracticable. However, the series of means of the pairs of adjacent values is more tractable, a fair working assumption being that the serial variation curve flattens out by a lag of about 26, for which the successive means have no values of $x(t)$ in common.

The analysis of semi-squared differences (following Jowett, 1955a) between the eleven 
resulting means is as follows: 


























Source         Sum of s.s.d.’s    Number    Mean s.s.d.
All                433,450          121        3,582
Lag 0                    0           11            0
Lags ±1             38,143           20        1,907
Other lags         395,307           90        4,392

Hence Est σ_p̄ = √(4392 − 3582) = 28. 


This can only be a very rough estimate, as will be obvious to any statistician who looks 
at the figures on which it is based; however, it suggests that the value of p̄ could easily be 
fortuitous, and there are no compelling grounds for making the correction suggested above. 


We would like to thank Dr H. C. Hamaker, who first drew our attention to the need for 
a solution of this problem; to Miss H. M. Davies and Miss D. Kirk, who have assisted us 
with the computation; and to Mr J. Hebden of the National Coal Board and to the Carbon- 
ization Branch of the Board, who kindly supplied the data. 
THE RANDOM WALK [IN CONTINUOUS TIME] AND 
ITS APPLICATION TO THE THEORY OF QUEUES 


By C. R. HEATHCOTE AND J. E. MOYAL 


Australian National University, Canberra 


SUMMARY. The random walk in continuous time has been considered by a number of authors 
for particular types of boundary conditions, e.g. the unrestricted random walk by Irwin (1937); the 
random walk with one absorbing or one reflecting barrier by Ledermann & Reuter (1954), and sub- 
sequently by Bailey (1954). The present paper gives a unified theory of these processes which includes 
previous work on the subject as special cases and readily yields the solution for any other consistent 
form of boundary conditions. Several of these are of interest in the theory of queues with exponential 
service and arrival times. The random walk with reflecting barriers at 0 and N (§4) is the single server 
queue problem where the number of possible customers is limited to N (Morse, 1958). By letting 
N → ∞ we obtain the familiar single server queue problem solved previously in the papers by Ledermann 
& Reuter and by Bailey referred to above and also by Clarke (1956), Champernowne (1956) and 
Conolly (1958). 

The passage to a ‘diffusion’ process is also considered. The method given provides an alternative to 
the well-known method of images (Chandrasekhar, 1943; Bartlett, 1956). In §5 we consider a more 
general process where the transition intensities λ and μ depend linearly on the particle’s position in a 
bounded interval, and are constant outside this interval. This enables us to give the Laplace transform 
of the probability generating function for the N server queue, a problem recently considered by 
Karlin & McGregor (1958). 


1. INTRODUCTION 


A particle will be said to describe a random walk in continuous time if it has constant 
chances λ, μ per unit time of taking one unit step respectively to the right or left on the real 
axis. For prescribed boundary conditions, such as absorbing or reflecting barriers at given 
positions, we ask for the distribution of the particle’s position at time t after the start of 
the process. Specifically, we choose the origin on the real axis (as can be done without loss 
of generality) so that the initial position of the particle at t = 0 is k, an integer, and for each 
of the processes studied we seek $P_{k,n}(t)$, the probability of a transition from k to n in time t. 
We use the ‘backward’ equation (Bartlett, 1956; Feller, 1957) 


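$$\frac{\partial}{\partial t}P_{k,n}(t) = -(\lambda+\mu)P_{k,n}(t) + \lambda P_{k+1,n}(t) + \mu P_{k-1,n}(t). \tag{1·1}$$

The probability generating function $G_k(z,t) = \sum_n P_{k,n}(t)\,z^n$ then satisfies 

$$\frac{\partial}{\partial t}G_k(z,t) = -\alpha G_k(z,t) + \lambda G_{k+1}(z,t) + \mu G_{k-1}(z,t), \tag{1·2}$$

where $\alpha = \lambda+\mu$. Clearly $G_k(z,0) = z^k$. Introducing the Laplace transform 
$$g_k(z,s) = \mathcal{L}[G_k(z,t)] = \int_0^\infty e^{-st}\,G_k(z,t)\,dt,$$
it follows from (1·2) that 
$$(s+\alpha)\,g_k = \lambda g_{k+1} + \mu g_{k-1} + z^k \qquad (\operatorname{Re} s > 0). \tag{1·3}$$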


The range of k depends on the problem under consideration. Thus for the unrestricted 
random walk k = 0, ±1, ±2, ...; whereas for a barrier at the origin k is restricted to the 
non-negative integers. 

The system consisting of the backward equation, together with the side conditions, has 
a unique ‘honest’ solution (‘honest’ in the sense that $\sum_n P_{k,n}(t) = 1$, or equivalently 
$G_k(1,t) = 1$, $g_k(1,s) = s^{-1}$); in fact ‘honesty’ implies uniqueness and conversely. This is 
well known to be the case for (1-1) without side conditions (cf. Bartlett, Feller, loc. cit.). 
With side conditions it can be proved to follow from the general theory of discontinuous 
Markov processes given in Moyal (1957) by transforming the system into an equivalent 
integral equation of the type considered in that paper; the explicit solution for the single 
absorbing boundary case is given as an example of the general theory in Moyal (1959). 
The method of obtaining explicit solutions developed in the present paper is however much 
less laborious, especially for the more complicated side conditions, but we still rely on the 
general theory for the fact that the ‘honesty’ of each solution thus obtained implies its 
uniqueness. 

Asymptotic equilibrium distributions are known to exist for this type of process. Their 
explicit expression is most easily obtained directly from the Laplace transform solution of 
(1·2) by an extension of Abel’s Theorem (Widder, 1946, ch. v), which yields 
$$\lim_{s\to+0} s\,g_k(z,s) = \lim_{t\to\infty} G_k(z,t), \tag{1·4}$$
whenever the limit on the right-hand side exists. It is in fact sufficient for (1·4) to be true 
that the right-hand side exist as a Cesàro limit. 

There is another type of asymptotic behaviour of the solutions which is of interest, 

namely, the passage to a ‘diffusion’ process. We effect this passage as follows: Let 


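$$\alpha = \lambda+\mu = \sigma^2/h^2, \qquad \lambda-\mu = a/h, \qquad x = nh, \qquad y = kh. \tag{1·5}$$

We assume that each step is of magnitude h, that σ and a are constants, and we look for the 
distribution of the displacement x conditional on y as h → 0. Let 

$$f(x \mid y; t) = \lim_{h\to 0} P_{y/h,\,x/h}(t). \tag{1·6}$$

Carrying out this limiting procedure in (1·1) one arrives formally at the ‘backward’ diffusion 
equation with drift term 
$$\frac{\partial}{\partial t}f(x \mid y; t) = \tfrac12\sigma^2\frac{\partial^2}{\partial y^2}f(x \mid y; t) + a\,\frac{\partial}{\partial y}f(x \mid y; t). \tag{1·7}$$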


It will be shown later that this passage to the limit in the solutions of the random walk 
problems yields the solutions of (1-7) with the appropriate boundary conditions. 


2. THE GENERAL SOLUTION OF THE RANDOM WALK PROBLEM 


To obtain the general solution of the difference equation (1·3) we need a particular solution 
and the solution of the homogeneous equation obtained by omitting the term $z^k$. Since the 
transition probabilities clearly remain invariant under translations of the origin it follows 
that $G_k(z,t) = z^k G_0(z,t)$ and hence 
$$g_k(z,s) = z^k g_0(z,s) \qquad (k = 0, \pm1, \pm2, \ldots).$$
Substituting in (1·3) we have 
$$g_0(z,s) = (s+\alpha-\lambda z-\mu z^{-1})^{-1} = \phi(z,s),$$
so that a particular solution of (1·3) is 
$$g_k(z,s) = z^k\phi(z,s). \tag{2·1}$$
Substituting $g_k = u^k$ in the homogeneous equation we obtain 
$$u^k(s+\alpha-\lambda u-\mu u^{-1}) = 0, \tag{2·2}$$
the two solutions of which are the zeros of $\phi^{-1}(z,s)$, namely 
$$u_1(s) = (2\lambda)^{-1}\big[s+\alpha-\sqrt{\{(s+\alpha)^2-4\lambda\mu\}}\big] \tag{2·3}$$
and 
$$u_2(s) = (2\lambda)^{-1}\big[s+\alpha+\sqrt{\{(s+\alpha)^2-4\lambda\mu\}}\big].$$
The general solution of (1·3) is therefore 
$$g_k(z,s) = z^k\phi(z,s) + A(z,s)\,u_1^k(s) + B(z,s)\,u_2^k(s). \tag{2·4}$$

We require that $g_k \to 0$ as $s \to \infty$, so, since $u_1 \to 0$ and $u_2 \to \infty$ as $s \to \infty$, A must be 
bounded, while B must tend to 0 with 1/s faster than $u_2^{-k}$. For problems involving only one 
boundary we set B = 0 so that (2·4) reduces to 
$$g_k(z,s) = z^k\phi(z,s) + A(z,s)\,u_1^k(s), \tag{2·5}$$
where A is determined by the boundary condition. Boundary conditions at two points 
determine both A and B. 
If no boundary conditions are imposed we have the unrestricted random walk, the 
solution of which is (2·1). The Laplace transforms of the probabilities $P_{k,n}(t)$ may be readily 
obtained by expanding (2·1) in a Laurent series for values of z in an annulus centred on the 
origin, which contains the circle |z| = 1 for all s whose real part is positive. Clearly it is 
sufficient to do this for k = 0. We have then 
$$g_0(z,s) = \lambda^{-1}(u_2-u_1)^{-1}\Big\{\sum_{r=0}^{\infty}(z/u_2)^r + \sum_{r=1}^{\infty}(u_1/z)^r\Big\}. \tag{2·6}$$
We note that $g_0(1,s) = 1/s$, so that the solution is unique. The inverse Laplace transforms 
may be found directly from tables (Erdelyi, 1954) to give for the probabilities 
$$P_{0,r}(t) = \beta^r e^{-\alpha t} I_r\big(2t\sqrt{(\lambda\mu)}\big) \qquad (r = 0, \pm1, \pm2, \ldots), \tag{2·7}$$
where $\beta = (\lambda/\mu)^{1/2}$, and $I_r(x)$ is the modified Bessel function of the first kind. Throughout 
the rest of this paper the argument of Bessel functions will be suppressed, it being understood 
that it is $2t\sqrt{(\lambda\mu)}$. The mean and variance of the distribution (2·7) are easily found 
to be (λ − μ)t and αt = (λ + μ)t respectively. Inverting (2·1) we have the generating function 
$$G_k(z,t) = z^k\exp[-t(\alpha-\lambda z-\mu z^{-1})], \tag{2·8}$$
which may be expanded to verify the results obtained above. 
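The closed form (2·7) lends itself to a quick numerical check. The following Python sketch is an illustration only: the helper names `bessel_i` and `walk_prob`, the power-series evaluation of $I_r$, and the parameter values are assumptions introduced here, not part of the paper. It evaluates $P_{0,r}(t)$ over a wide range of r and computes the total probability, mean and variance, which should be close to 1, (λ − μ)t and (λ + μ)t.

```python
import math

def bessel_i(order, x, terms=200):
    """Modified Bessel function I_order(x) for integer order, via its power series."""
    n = abs(order)  # I_{-n}(x) = I_n(x) for integer n
    term = (x / 2.0) ** n / math.factorial(n)  # m = 0 term of the series
    total = 0.0
    for m in range(terms):
        total += term
        # ratio of consecutive series terms: (x/2)^2 / ((m+1)(m+1+n))
        term *= (x / 2.0) ** 2 / ((m + 1) * (m + 1 + n))
    return total

def walk_prob(r, t, lam, mu):
    """P_{0,r}(t) = beta^r exp(-alpha t) I_r(2 t sqrt(lam mu)), as in (2.7)."""
    alpha = lam + mu
    beta = math.sqrt(lam / mu)
    return beta ** r * math.exp(-alpha * t) * bessel_i(r, 2.0 * t * math.sqrt(lam * mu))

# Illustrative (assumed) parameter values.
lam, mu, t = 1.5, 1.0, 2.0
probs = {r: walk_prob(r, t, lam, mu) for r in range(-60, 61)}
total = sum(probs.values())
mean = sum(r * p for r, p in probs.items())
var = sum((r - mean) ** 2 * p for r, p in probs.items())
print(total, mean, var)
```

The range −60 ≤ r ≤ 60 is a truncation chosen so that the neglected tail mass is negligible for these parameter values; larger t would need a wider range.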
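If now we substitute from (1·5) and set $z = e^{i\theta h}$, $z^k = e^{i\theta y}$ in (2·4), then 
$$w(\theta,s,y) = \lim_{h\to0} g_{y/h}(e^{i\theta h}, s)$$
$$= e^{i\theta y}(s+\tfrac12\sigma^2\theta^2-ia\theta)^{-1} + a(\theta,s)\exp\Big\{-\frac{y}{\sigma}\Big[\frac{a}{\sigma}+\sqrt{\Big(2s+\frac{a^2}{\sigma^2}\Big)}\Big]\Big\} + b(\theta,s)\exp\Big\{-\frac{y}{\sigma}\Big[\frac{a}{\sigma}-\sqrt{\Big(2s+\frac{a^2}{\sigma^2}\Big)}\Big]\Big\}, \tag{2·9}$$
where $a(\theta,s) = \lim_{h\to0} A(e^{i\theta h},s)$ and $b(\theta,s) = \lim_{h\to0} B(e^{i\theta h},s)$ and w(θ, s, y) is the Laplace–Fourier 
transform of the solution of (1·7), i.e. 
$$w(\theta,s,y) = \int_0^\infty e^{-st}\,dt\int_{-\infty}^{\infty} e^{i\theta x} f(x \mid y, t)\,dx. \tag{2·10}$$
In the unrestricted case a(θ, s) = b(θ, s) = 0 and 
$$f(x \mid y; t) = (2\pi\sigma^2 t)^{-1/2}\exp\Big\{-\frac{(x-y-at)^2}{2\sigma^2 t}\Big\}. \tag{2·11}$$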

3. THE RANDOM WALK WITH ABSORBING BARRIERS 


If the states 0 and N, say, are absorbing barriers, the process stops whenever the particle 
reaches either 0 or N so the boundary conditions are 
$$g_0 = 1/s, \qquad g_N = z^N/s. \tag{3·1}$$
The values of the constants A and B appropriate to this process are found by substituting 
the general solution (2·4) in (3·1); with these values of A and B (2·4) becomes 
$$g_k = (z^k-u_1^k)\phi + s^{-1}u_1^k + (s^{-1}-\phi)(z^N-u_1^N)(u_2^k-u_1^k)(u_2^N-u_1^N)^{-1} \tag{3·2}$$
$$= (u_2^N-u_1^N)^{-1}\Big\{s^{-1}(u_1u_2)^k(u_2^{N-k}-u_1^{N-k}) + \sum_{r=1}^{k} z^r\lambda^{-1}(u_2-u_1)^{-1}(u_2^r-u_1^r)(u_2^{N-k}-u_1^{N-k})(u_1u_2)^{k-r}$$
$$\qquad + \sum_{r=k+1}^{N-1} z^r\lambda^{-1}(u_2-u_1)^{-1}(u_2^k-u_1^k)(u_2^{N-r}-u_1^{N-r}) + s^{-1}z^N(u_2^k-u_1^k)\Big\}. \tag{3·3}$$


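Inverting the Laplace transforms gives for the probabilities 
$$P_{k,0}(t) = \beta^{-k}\sum_{j=0}^{\infty}\int_0^t \tau^{-1}e^{-\alpha\tau}\{(2jN+k)I_{2jN+k} - [2(j+1)N-k]\,I_{2(j+1)N-k}\}\,d\tau, \tag{3·4}$$
$$P_{k,r}(t) = \beta^{r-k}e^{-\alpha t}\sum_{j=0}^{\infty}\{I_{2(j+1)N+r-k} + I_{2jN+k-r} - I_{2(j+1)N-k-r} - I_{2jN+k+r}\} \qquad (r = 1, 2, \ldots, k), \tag{3·5}$$
$$P_{k,r}(t) = \beta^{r-k}e^{-\alpha t}\sum_{j=0}^{\infty}\{I_{2(j+1)N+k-r} + I_{2jN-k+r} - I_{2(j+1)N-k-r} - I_{2jN+k+r}\} \qquad (r = k+1, k+2, \ldots, N-1), \tag{3·6}$$
$$P_{k,N}(t) = \beta^{N-k}\sum_{j=0}^{\infty}\int_0^t \tau^{-1}e^{-\alpha\tau}\{[(2j+1)N-k]\,I_{(2j+1)N-k} - [(2j+1)N+k]\,I_{(2j+1)N+k}\}\,d\tau. \tag{3·7}$$

The absorption probabilities $P_{k,0}(t)$ and $P_{k,N}(t)$ are symmetrical in λ, μ and k, N − k as 
expected. The expected value of the position r at time t conditional on initial position k is 
$$E(r \mid k,t) = k+(\lambda-\mu)t+(\lambda-\mu)\sum_{j=0}^{\infty}\int_0^t \tau^{-1}(t-\tau)e^{-\alpha\tau}\Big\{\beta^{-k}\{[2(j+1)N-k]\,I_{2(j+1)N-k} - (2jN+k)\,I_{2jN+k}\}$$
$$\qquad + \beta^{N-k}\{[(2j+1)N+k]\,I_{(2j+1)N+k} - [(2j+1)N-k]\,I_{(2j+1)N-k}\}\Big\}\,d\tau. \tag{3·8}$$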
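The generating function of the stationary distribution obtained using (1·4) is 
$$\lim_{t\to\infty} G_k(z,t) = \lim_{s\to+0} s\,g_k(z,s) = \pi_0 + z^N\pi_N = \begin{cases}\dfrac{(\mu/\lambda)^k-(\mu/\lambda)^N + z^N\{1-(\mu/\lambda)^k\}}{1-(\mu/\lambda)^N} & \text{if } \lambda \neq \mu,\\[1ex] 1-k/N+z^N k/N & \text{if } \lambda = \mu,\end{cases} \tag{3·9}$$
which yields for the expectation E(r | k, ∞) 
$$E(r \mid k,\infty) = \begin{cases} N\{1-(\mu/\lambda)^k\}\{1-(\mu/\lambda)^N\}^{-1} & \text{if } \lambda \neq \mu,\\ k & \text{if } \lambda = \mu.\end{cases} \tag{3·10}$$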
The boundary value problem discussed above is the continuous time analogue of the 
classical ‘gambler’s ruin’ problem, in which two gamblers with initial capital k and N − k 
respectively compete. As may be deduced from general arguments the asymptotic results 
we derive are identical with those of Feller (1957, chapter XIV). The probabilities of ultimate 
ruin are $\pi_0$ and $\pi_N$ respectively given by (3·9). Since $\pi_0+\pi_N = 1$ ultimate absorption at 
either one of the boundaries is certain, and the expected lapse of time before this occurs is 
$$E(t \mid k) = \int_0^\infty t\{P'_{k,0}(t)+P'_{k,N}(t)\}\,dt = \begin{cases}(\lambda-\mu)^{-1}\big[N\{1-(\mu/\lambda)^k\}\{1-(\mu/\lambda)^N\}^{-1}-k\big] & \text{if } \lambda \neq \mu,\\ k(N-k)(2\lambda)^{-1} & \text{if } \lambda = \mu.\end{cases} \tag{3·11}$$

The solution of the problem with a single absorbing barrier, say at the origin, may either 
be found from the above by letting N → ∞ or directly using (2·5). Taking the limit as N → ∞ 
in (3·2) we have for the generating function 
$$g_k = (z^k-u_1^k)\phi + s^{-1}u_1^k. \tag{3·12}$$
The last term in the right-hand side of (3·2) tends to 0 as N → ∞ since $u_1 < |z| < u_2$ and 
$u_1 < 1$, $u_2 > 1$ for Re s > 0. The probabilities in this case are 
$$P_{k,0}(t) = k\beta^{-k}\int_0^t \tau^{-1}e^{-\alpha\tau}I_k\,d\tau, \tag{3·13}$$
$$P_{k,r}(t) = \beta^{r-k}e^{-\alpha t}\{I_{r-k}-I_{r+k}\} \qquad (r = 1, 2, 3, \ldots), \tag{3·14}$$
with expectation 
$$E(r \mid k,t) = k+(\lambda-\mu)t-(\lambda-\mu)k\beta^{-k}\int_0^t \tau^{-1}(t-\tau)e^{-\alpha\tau}I_k\,d\tau. \tag{3·15}$$
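These three equations yield incidentally the relations 
$$\sum_{n=1}^{\infty}\beta^n\{I_{n-k}-I_{n+k}\} = \beta^k e^{\alpha t} - k\int_0^t \tau^{-1}e^{\alpha(t-\tau)}I_k\,d\tau \tag{3·16}$$
and, by putting λ = μ, 
$$\sum_{n=1}^{\infty} n\{I_{n-k}(\alpha t)-I_{n+k}(\alpha t)\} = k e^{\alpha t} \qquad (k = 1, 2, \ldots). \tag{3·17}$$

Asymptotic results are well known (see Feller, loc. cit.) and are 
$$P_{k,0}(\infty) = \begin{cases}(\mu/\lambda)^k & \text{if } \lambda > \mu,\\ 1 & \text{if } \lambda \le \mu,\end{cases} \tag{3·18}$$
$$P_{k,r}(\infty) = 0 \qquad (r = 1, 2, \ldots), \tag{3·19}$$
$$E(r \mid k,\infty) = \begin{cases}+\infty & \text{if } \lambda > \mu,\\ k & \text{if } \lambda = \mu,\\ 0 & \text{if } \lambda < \mu.\end{cases} \tag{3·20}$$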
Probability densities of first passage times through a barrier are given by differentiating 
the appropriate absorption probability. For example, in the single barrier case, the probability 
density of the first passage time through the origin is, from (3·13), 
$$p_k(t)\,dt = k\beta^{-k}t^{-1}e^{-\alpha t}I_k\,dt. \tag{3·21}$$
Passage to the ‘diffusion process’ limit in the case of a single absorbing barrier at 0 yields 
$$w(\theta,s,y) = e^{i\theta y}(s-ia\theta+\tfrac12\sigma^2\theta^2)^{-1} + [1/s-(s-ia\theta+\tfrac12\sigma^2\theta^2)^{-1}]\exp\Big\{-\frac{y}{\sigma}\Big[\frac{a}{\sigma}+\sqrt{\Big(2s+\frac{a^2}{\sigma^2}\Big)}\Big]\Big\}. \tag{3·22}$$
The second term in the right-hand side of (3·22) is the Laplace–Fourier transform of the 
probability of absorption. Inverting and differentiating with respect to t we obtain the 
probability density p of the first passage time at the origin 
$$p(t,y) = \frac{y}{\sigma\sqrt{(2\pi t^3)}}\exp\Big\{-\frac{(y+at)^2}{2\sigma^2 t}\Big\}. \tag{3·23}$$
Note that 
$$\int_0^\infty p(t,y)\,dt = \begin{cases}1 & \text{if } a \le 0,\\ e^{-2ay/\sigma^2} & \text{if } a > 0.\end{cases} \tag{3·24}$$
Let $f_0(x,t)$ be the density obtained by setting y = 0 in (2·11); then the probability density 
in the absence of absorption is 
$$f(x \mid y; t) = f_0(x-y,t) - \int_0^t f_0(x,t-\tau)\,p(\tau,y)\,d\tau. \tag{3·25}$$
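Set Nh = L, a constant. A more symmetric form of the transform w(θ, s, y) is obtained 
in the case of two absorbing barriers by translating the origin of x and y to ½L, 
$$w(\theta,s,y) = \psi e^{i\theta y} + \frac{(s^{-1}-\psi)\,e^{-ay/\sigma^2}}{\sinh\omega L}\big\{e^{(i\theta+a/\sigma^2)\frac12 L}\sinh\omega(y+\tfrac12L) - e^{-(i\theta+a/\sigma^2)\frac12 L}\sinh\omega(y-\tfrac12L)\big\}, \tag{3·26}$$
where $\psi = (s-ia\theta+\tfrac12\sigma^2\theta^2)^{-1}$ and $\omega = \sigma^{-1}(2s+a^2/\sigma^2)^{1/2}$. 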


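The first passage time probability densities at ½L and −½L are obtained by inverse 
transformation of the terms involving $e^{\pm\frac12 i\theta L}$ in the right-hand side of (3·26). One finds 
$$p_+(t,y)\,dt = \frac{\exp[-(a/\sigma^2)\{(y-\tfrac12L)+\tfrac12 at\}]}{\sqrt{(2\pi\sigma^2t^3)}}\sum_{n=-\infty}^{\infty}[(2n+\tfrac12)L-y]\exp\Big[-\frac{\{(2n+\frac12)L-y\}^2}{2\sigma^2t}\Big]dt,$$
$$p_-(t,y)\,dt = \frac{\exp[-(a/\sigma^2)\{(y+\tfrac12L)+\tfrac12 at\}]}{\sqrt{(2\pi\sigma^2t^3)}}\sum_{n=-\infty}^{\infty}[(2n+\tfrac12)L+y]\exp\Big[-\frac{\{(2n+\frac12)L+y\}^2}{2\sigma^2t}\Big]dt, \tag{3·27}$$
where $p_+$ and $p_-$ are respectively the first passage time distributions at ½L and −½L. The 
probability density in the absence of absorption has an expression similar to (3·25), 
$$f(x \mid y; t) = f_0(x-y,t) - \int_0^t f_0(x,t-\tau)[p_+(\tau,y)+p_-(\tau,y)]\,d\tau. \tag{3·28}$$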
4. THE RANDOM WALK WITH REFLECTING BARRIERS 


We now take the barriers at 0 and N to be reflecting ones, in which case the boundary 
conditions satisfied by (2·4) are 
$$(s+\lambda)\,g_0 = \lambda g_1+1, \qquad (s+\mu)\,g_N = \mu g_{N-1}+z^N. \tag{4·1}$$
Solving for the constants A and B as in §3 we obtain the solution 
$$g_k = \phi\{X_k(z^N-z^{N+1})+z^k+Y_k(1-z^{-1})\}, \tag{4·2}$$
where 
$$X_k = \lambda s^{-1}\{(u_2^{k+1}-u_1^{k+1})-u_1u_2(u_2^k-u_1^k)\}(u_2^{N+1}-u_1^{N+1})^{-1},$$
$$Y_k = \lambda s^{-1}(u_1u_2)^{k+1}\{(u_2^{N-k+1}-u_1^{N-k+1})-(u_2^{N-k}-u_1^{N-k})\}(u_2^{N+1}-u_1^{N+1})^{-1}.$$
(4·2) may be written as 
$$g_k = \phi\big\{z^k+(1-z^{-1})u_1^{k+1}(1-u_1)^{-1}+X_k\{z^N-z^{N+1}+(1-z^{-1})u_1^{N+1}\}\big\}$$
$$\to \phi\{z^k+(1-z^{-1})u_1^{k+1}(1-u_1)^{-1}\} \quad \text{as } N \to \infty; \tag{4·3}$$
(4·3) is thus the solution for a single reflecting barrier at the origin, a result which may be 
checked by direct methods. 
Expanding (4-2) and inverting the coefficient of 2” yields for the probabilities: 


P,, -(t) = Br-* e-* I,,_,, + Br-* one agsvesn+K0—r + Loginwsn-K-r} 
wo ft 
+ noes | e-*" {To. yiwi-)-k-r—2 — 28 Tog 4wad—k-r—1 + Beg -pwd—k-+ 
+ Toni v+kerse ig 28-Ussn D+k+r4+1 + BP loavenirt dr (r = 0, 1, 2, ees N). (4-4) 
In order to lighten the algebra, we give the expected value for the special case k = 0 only; 
co t 
E(r|0,t) = (A—p)t+A > | 1r(t—7) e-*7 ©, (7) dr, (4:5) 
j=0J0 
where ®,(7) = B-{(2jN — 1) Tosn—1 —[2(j7+1)N +1) Ty54.) Natt 
+ B*{2(j + 1) Nags — 27 N Ley} + BO P{[(29 + 1) N + 1) Lajas 


—{(2j +1) N — 1) [a5 n-a}- 
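Asymptotic results are most easily obtained by applying (1·4) to (4·2). Thus the generating 
function of the stationary distribution is 
$$\lim_{s\to+0} s\,g_k(z,s) = \lim_{t\to\infty} G_k(z,t) = \begin{cases}\dfrac{(\lambda-\mu)\{z^{N+1}-(\mu/\lambda)^{N+1}\}}{(\lambda z-\mu)\{1-(\mu/\lambda)^{N+1}\}} & \text{if } \lambda \neq \mu,\\[1ex] \dfrac{1-z^{N+1}}{(1-z)(N+1)} & \text{if } \lambda = \mu,\end{cases} \tag{4·6}$$
and hence 
$$\lim_{t\to\infty} P_{k,r}(t) = \begin{cases}\dfrac{(\lambda-\mu)(\mu/\lambda)^{N-r}}{\lambda\{1-(\mu/\lambda)^{N+1}\}} & \text{if } \lambda \neq \mu,\\[1ex] 1/(N+1) & \text{if } \lambda = \mu.\end{cases} \tag{4·7}$$
The expectation of r in the stationary distribution is 
$$E(r,\infty) = \begin{cases}\dfrac{N-(\mu/\lambda)\{1-(\mu/\lambda)^N\}\{1-\mu/\lambda\}^{-1}}{1-(\mu/\lambda)^{N+1}} & \text{if } \lambda \neq \mu,\\[1ex] \tfrac12 N & \text{if } \lambda = \mu.\end{cases} \tag{4·8}$$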
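As an illustration of (4·7) and (4·8), the following Python sketch tabulates the stationary distribution for two reflecting barriers and compares its mean with the closed form. The function names and the parameter values are assumptions made for this example; the detailed-balance check λπ_r = μπ_{r+1} is the usual birth-and-death argument and is offered here only as an independent consistency test.

```python
def stationary_reflecting(N, lam, mu):
    """Stationary probabilities pi_0..pi_N for reflecting barriers at 0 and N, as in (4.7)."""
    if lam == mu:
        return [1.0 / (N + 1)] * (N + 1)
    x = mu / lam
    c = (lam - mu) / (lam * (1 - x ** (N + 1)))
    return [c * x ** (N - r) for r in range(N + 1)]

def stationary_mean(N, lam, mu):
    """Mean of the stationary distribution, as in (4.8)."""
    if lam == mu:
        return N / 2
    x = mu / lam
    return (N - x * (1 - x ** N) / (1 - x)) / (1 - x ** (N + 1))

# Illustrative (assumed) values: N = 5, lam = 1, mu = 2.
pi = stationary_reflecting(5, 1.0, 2.0)
direct_mean = sum(r * p for r, p in enumerate(pi))
print(pi)
print(direct_mean, stationary_mean(5, 1.0, 2.0))
```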
In the context of queueing theory the random walk considered above is the single server 
queue problem with the restriction that the size of the queue is ≤ N. In this case new arrivals 
do not join the queue if there are already N − 1 ‘customers’ waiting for service. The probability 
$P_{k,r}(t)$ given by (4·4) is then the probability of having r ‘customers’ in the system at 
time t (including the one being served) conditional on k ‘customers’ at time zero. We list 
below the Laplace transforms of some formulae of interest. Inversion of these results involves 
simple but tedious calculations which we do not carry out. 

The expected length of the waiting line L at time t, conditional on initial length k − 1, is 
$$E(L \mid k-1,t) = E(r \mid k,t)+P_{k,0}(t)-1$$
and has the Laplace transform 
$$\mathcal{L}[E(L \mid k-1,t)] = (\lambda-\mu)s^{-2}+s^{-1}\{k-1-X_k+(1+s\mu^{-1})Y_k\} \tag{4·9}$$
and, by (1·4), 
$$E(L \mid k-1,\infty) = \big\{N-\{1-(\mu/\lambda)^N\}\{1-\mu/\lambda\}^{-1}\big\}\{1-(\mu/\lambda)^{N+1}\}^{-1}.$$


The distribution of the duration of a busy period, $p_k(t)$, is obtained by considering the 
random walk with an absorbing barrier at the origin; i.e. with the boundary conditions 
$$g_0 = 1/s, \qquad (s+\mu)\,g_N = \mu g_{N-1}+z^N. \tag{4·10}$$
Proceeding as before the solution of (2·4) with these boundary conditions is 
$$g_k = s^{-1}V_k+\phi\{(z^N-z^{N+1})W_k+z^k-V_k\}, \tag{4·11}$$
where 
$$V_k = \{u_1^k u_2^N(u_2-1)+u_1^N u_2^k(1-u_1)\}\{u_2^N(u_2-1)+u_1^N(1-u_1)\}^{-1}$$
and 
$$W_k = (u_2^k-u_1^k)\{u_2^N(u_2-1)+u_1^N(1-u_1)\}^{-1}.$$
The probability that the particle will be absorbed at the origin before time t, conditional on 
initial state k, is $\mathcal{L}^{-1}[V_k/s]$. Thus the probability density, $p_k(t)$, of the duration of a busy 
period has the Laplace transform 
$$\mathcal{L}[p_k(t)] = V_k(s). \tag{4·12}$$


The formulae are simplified when N → ∞ and we have the familiar single server queue 
problem in which no limit is placed on queue length. The appropriate generating function 
is given by (4·3), and the transition probabilities by 


t 
Pat) = PH ea g AA | 69 (Pee 2B asesatlesnsah dr} (= 0512)... 
(4-13) 


(4:13) is identical with the result given by Ledermann & Reuter (1954, p. 366), obtained 
by the use of spectral theory*. The expected value of r is 


$$E(r \mid k,t) = k+(\lambda-\mu)t+\sum_{j=0}^{\infty}\beta^{-(k+1+j)}(k+1+j)\int_0^t \tau^{-1}e^{-\alpha\tau}I_{k+1+j}\,d\tau. \tag{4·14}$$


* This result has also been obtained by the following authors: Bailey (1954), by solving the forward 
equations using a generating function technique; Clarke (1956), by solving an integral equation of 
Volterra type; Champernowne (1956), by a direct probability argument; and Conolly (1958), by an 
argument similar to ours. We are indebted to the referee for the last three of these references. 
The expected length of the waiting line can be written down from (4·13) and (4·14). The 
probability density of the duration of a busy period, $p_k(t)$, is given by differentiating (3·13), 
thus 
$$p_k(t)\,dt = k\beta^{-k}t^{-1}e^{-\alpha t}I_k\,dt. \tag{4·15}$$


The ergodic properties of this process have been investigated in detail by Kendall (1951). 
In order to obtain the ‘diffusion process’ limit, set again Nh = L and translate the origin 
of x and y to ½L. The reflecting boundary condition is seen to correspond to ∂f/∂y = 0 at 
±½L, and one finds the transform w(θ, s, y) of f(x | y, t) (using the same notation as in (3·26)) 
tO eal? 


w(O, 8, y) = Vy — ee {e(i+alo AL cosh w(y + 4L) + e-+4/04L cosh w(y — $L)}. 
(416) 


5. QUEUES WITH N SERVERS 


The exponential arrival and service time queue with N servers (Feller, 1957, p. 415) is 
defined in our notation by the equations 
$$(s+\lambda+k\mu)\,g_k = \lambda g_{k+1}+k\mu g_{k-1}+z^k \qquad (k = 0, 1, \ldots, N), \tag{5·1}$$
$$(s+\lambda+N\mu)\,g_k = \lambda g_{k+1}+N\mu g_{k-1}+z^k \qquad (k = N, N+1, \ldots). \tag{5·2}$$

Before giving the solution $g_k(z,s)$ of the system (5·1) and (5·2) we mention briefly the 
generalization of (1·3) to the case where the coefficients of the difference equation are linear 
functions of k; 
$$\{k(a_1+b_1)+s+a_0+b_0\}\,g_k = (ka_1+a_0)\,g_{k+1}+(kb_1+b_0)\,g_{k-1}+z^k. \tag{5·3}$$


The homogeneous equation obtained from (5·3) is the well-known hypergeometric difference 
equation, and its solutions have been studied in detail by several authors (e.g. Batchelder, 
1927). In fact there are twenty-four such solutions corresponding to the twenty-four 
solutions of the hypergeometric differential equation (Batchelder, 1927, p. 101), and we 
may choose two of these appropriate to our problem, say $f_k(s)$ and $h_k(s)$. A particular solution 
of the non-homogeneous equation may be found by taking the Laplace transform of 
the generating function $G_k(z,t)$ of the original birth and death process. $G_k(z,t)$ can be found 
readily by standard methods and has the convoluted binomial form 


€i(at) = poor fPleO" 1) + 26a, —Dy eee) FH 
= a,—b, 


a, &-bpt_h +z2a,(1— e(4,—byt)) ao/a,—k 

( 1 ) 1( ) 

——aa ‘ania: . (5-4) 
bes | 

The general solution of (5·3) is then 
$$g_k(z,s) = \psi_k(z,s)+A(z,s)\,f_k(s)+B(z,s)\,h_k(s), \tag{5·5}$$
where $\psi_k(z,s) = \mathcal{L}[G_k(z,t)]$. As before A and B may be chosen to satisfy given boundary 
conditions. 

There are several interesting variants of (5·3). For example, a linear birth and death 
process with $a_0 = 0 = b_0$ and reflecting barriers at $k = N_1, N_2$; $N_2 > N_1$, could be used as a 
linearized model to investigate the logistic process (Kendall, 1949). The essential feature of 
Kendall’s logistic model is that the coefficients are quadratic in k, so that the states $N_1$ and 
$N_2$ are natural reflecting barriers. Explicit solutions can be obtained for the linear boundary 
value problem and could prove useful as an approximation to the quadratic case. 
Since (5·1) and (5·2) are special cases of (5·3) their solutions are, respectively, 
$$g_k(z,s) = \psi_k(z,s)+A(z,s)\,v_k(s) \qquad (k = 0, 1, \ldots, N), \tag{5·6}$$
$$g_k(z,s) = z^k\phi(z,s)+B(z,s)\,u_1^k(s) \qquad (k = N, N+1, \ldots). \tag{5·7}$$
$u_1(s)$ and $\phi(z,s)$ are the same as in previous paragraphs with Nμ substituted for μ; 
$$\psi_k(z,s) = \mathcal{L}\{[1-(1-z)e^{-\mu t}]^k\exp[-(\lambda/\mu)(1-z)(1-e^{-\mu t})]\}$$





= 100-0 (EM peu sjolneie i Qa. — (68) 
j=0 \J Id 


$$v_k(s) = \int_0^1 e^{-\lambda x/\mu}x^k(1-x)^{s/\mu-1}\,dx = B(k+1, s/\mu)\;{}_1F_1(k+1;\,k+1+s/\mu;\,-\lambda/\mu), \tag{5·9}$$
where B(m, n) and ${}_1F_1(m; n; x)$ are the Beta and confluent hypergeometric functions, respectively. 
The boundary at k = 0 being a natural one, A and B are determined by the equation 
for $g_N(z,s)$, yielding the solution 


pe {N(z—-uy) 6+ U4 Ywa— Vy}. (k = 0,1 
’ Un — Uy0N-1 7 





Vx mee |S 


9x(2, 8) a 





ak + uftt Nf2N-l(zvy_ — Vy) + Uv Vyas — Yn Vy}. (k = N,N +1,....) 
Un — Uy Uy-1 
(5-10) 
The Laplace transform of the expected number of ‘customers’ in the system at time t, 
given k initially, is 




















L{E(r | k,t)] = 
wk—-rX . (N-A), 1 _ (1-4) (N—Alp) +m), 
Mus) + e+ a) Soe A(uU,— 1)? S+y } 
(k = 0,...,1,NY), 
| pew), « (AA) )_ aN -1-Alu)oy—W—Ny)oyg\f 
(A-p TE) NUN-1— UN) — QUE = tT A) ON TN NL) ON 
mt 8 +8, a v8 8(8 +p) i 
(k= N,N +1,...), 
where Ry, = Y% [Vy — Uy Py_- 3, 


S, = uft!Y[oy — Wy vy). 
The stationary distribution, which exists for μN > λ, may be found directly from (5·10) 
using (1·4). The calculations involved in taking the limit are considerably simplified by 
using the following recurrence relation for confluent hypergeometric functions (Sneddon, 
1956) 
$$(x/\gamma)\;{}_1F_1(\alpha+1;\gamma+1;x) = {}_1F_1(\alpha+1;\gamma;x)-{}_1F_1(\alpha;\gamma;x). \tag{5·12}$$
Using (5·12) repeatedly in the terms $u_1\psi_{N-1}-\psi_N$ and $v_N-u_1v_{N-1}$, and expanding as a 
power series, we have 
$$\lim_{s\to+0} s\,g_k(z,s) = \lim_{t\to\infty} G_k(z,t) = P_0\Big\{\sum_{r=0}^{N-1}\frac{(\lambda z/\mu)^r}{r!}+\sum_{r=N}^{\infty}\frac{(\lambda z/\mu)^r}{N!\,N^{r-N}}\Big\}, \qquad \mu N > \lambda, \tag{5·13}$$
where 
$$P_0 = \Big\{\sum_{r=0}^{N-1}\frac{(\lambda/\mu)^r}{r!}+\sum_{r=N}^{\infty}\frac{(\lambda/\mu)^r}{N!\,N^{r-N}}\Big\}^{-1}.$$
(5·13) agrees with the result given by Feller (1957, p. 415). 
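The stationary distribution in (5·13) can be tabulated in a few lines. The Python sketch below is illustrative only: the function name `mmn_stationary`, the truncation point `r_max` and the parameter values are assumptions of this example. It sums the geometric tail over r ≥ N in closed form to obtain P_0, and the printed balance residual tests the relation λP_r = min(r + 1, N) μ P_{r+1}, which is the standard birth-and-death balance condition rather than the paper's derivation.

```python
import math

def mmn_stationary(N, lam, mu, r_max):
    """Stationary probabilities P_0..P_{r_max} of the N-server queue (needs mu*N > lam)."""
    if not lam < N * mu:
        raise ValueError("a stationary distribution needs mu*N > lam")
    rho = lam / mu
    head = sum(rho ** r / math.factorial(r) for r in range(N))
    # Closed-form geometric sum of rho^r / (N! N^(r-N)) over r >= N.
    tail = rho ** N / (math.factorial(N) * (1.0 - rho / N))
    p0 = 1.0 / (head + tail)
    probs = []
    for r in range(r_max + 1):
        if r < N:
            probs.append(p0 * rho ** r / math.factorial(r))
        else:
            probs.append(p0 * rho ** r / (math.factorial(N) * N ** (r - N)))
    return probs

# Illustrative (assumed) values: three servers, lam = 2, mu = 1.
q = mmn_stationary(3, 2.0, 1.0, 200)
residual = max(abs(2.0 * q[r] - min(r + 1, 3) * 1.0 * q[r + 1]) for r in range(50))
print(sum(q), residual)
```

With N = 1 the same function reduces to the geometric distribution of the single server queue, which gives a convenient sanity check.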
The barrier at the origin of the system (5·1) and (5·2) is a natural reflecting one. By 
imposing an absorbing barrier at the origin 
$$g_0 = 1/s \tag{5·14}$$
and solving (5·1), (5·2), for k > 0, and (5·14) for k = 0, we can find the probability density 
$p_k(t)$, of the first passage time through the origin, i.e. the density of the duration of a busy 
period. Say this solution is $g_k = g_k(z,s)$, k ≥ 1. Then 
$$p_k(t) = \frac{dP_{k,0}(t)}{dt} = \mathcal{L}^{-1}\{s\,g_k(0,s)\}. \tag{5·15}$$
[Note that $\int_0^\infty p_k(t)\,dt = 1$ if and only if μN > λ.] The moments of the distribution $p_k(t)$ 
may be found directly from (5·15) using the following elementary properties of the Laplace 
transform: If $\mathcal{L}\{f(t)\} = g(s)$ then 
$$\mathcal{L}\{t^n f(t)\} = (-1)^n\frac{d^n g(s)}{ds^n} \quad \text{and} \quad \mathcal{L}\Big\{\int_0^t f(\tau)\,d\tau\Big\} = s^{-1}g(s). \tag{5·16}$$
Using (5·16) and (1·4) the expected value of $t^n$ conditional on k, $E(t^n \mid k)$, is 
$$E(t^n \mid k) = \lim_{t\to\infty}\int_0^t \tau^n p_k(\tau)\,d\tau = \Big\{(-1)^n\frac{d^n}{ds^n}\big(s\,g_k(0,s)\big)\Big\}_{s=0}. \tag{5·17}$$

The solution of this absorbing barrier problem follows the same lines as before, with the 
modification that we require the general solution of (5·1), k > 0, with two arbitrary constants. 
It is easily verified that a second solution of the homogeneous equation obtained 
from (5·1) is 
$$w_k(s) = \sum_{j=0}^{k}\binom{k}{j}(\mu/\lambda)^j\,\Gamma(j+s/\mu). \tag{5·18}$$
The general solution of (5·1), k = 1, 2, ..., N, is then 
$$g_k(z,s) = \psi_k(z,s)+A(z,s)\,v_k(s)+C(z,s)\,w_k(s), \tag{5·19}$$
where $\psi_k$ and $v_k$ are given by (5·8) and (5·9) respectively. C can be eliminated by substituting 
(5·19) in (5·14) to give 
$$g_k = \{w_k+s(w_0\psi_k-w_k\psi_0)\}(sw_0)^{-1}+A(w_0v_k-w_kv_0)\,w_0^{-1}. \tag{5·20}$$
Elimination of the constants A and B from (5·20) and (5·6) gives the solution $g_k(z,s)$. We 
give the expression for the required quantity $g_k(0,s)$ for k > N only: 
$$g_k(0,s) = \frac{u_1^{k+1-N}\{(w_0v_N-w_Nv_0)(w_{N-1}+sw_0\psi_{N-1}-sw_{N-1}\psi_0)-(w_0v_{N-1}-w_{N-1}v_0)(w_N+sw_0\psi_N-sw_N\psi_0)\}}{sw_0\{(w_0v_N-w_Nv_0)-u_1(w_0v_{N-1}-w_{N-1}v_0)\}}, \tag{5·21}$$
where the function $\psi_k(z,s)$ appearing in (5·21) has been evaluated at z = 0. 
Application of (5·17) to (5·21) yields the moments of the duration of a busy period for 
k > N. Results for 1 ≤ k < N may of course be obtained by the same method. 
NOTE ON THE COMPARISON OF SEVERAL REALIZATIONS 
OF A MARKOFF CHAIN 


By J. H. DARWIN 
Department of Scientific and Industrial Research, Wellington, New Zealand 


1. INTRODUCTION AND SUMMARY 


Consider a probability chain defined by X(t) where t takes in succession the values $t_1, t_2, \ldots$, 
where $t_{i+1}-t_i = \tau$, τ > 0. Suppose the chain can be in any of s states at any time $t_i$. We 
describe the event ‘the chain is in state j when $t = t_i$’ by saying $X(t_i) = j$. Then the chain 
is called u-dependent if Prob $(X(t_i) = j)$ is a function of the u values 
$$X(t_i-\tau),\ X(t_i-2\tau),\ \ldots,\ X(t_i-u\tau).$$
We suppose this function is independent of i. 

A great deal of research into u-dependent chains of this kind has been carried out in 
recent years. Bartlett (1951) discussed the asymptotic distribution theory for functions 
of long realizations from such chains, that is for realizations consisting of $X(t_1), \ldots, X(t_N)$ 
when N is large. Following this preliminary work, writers such as Hoel (1954), Whittle 
(1955), Good (1955), Anderson & Goodman (1957) and Goodman (1958) have developed 
asymptotic tests of significance which, for instance, compare the maximum likelihood 
(m.l.) estimates of the probability $P\{X(t_i+r\tau) = j \mid \text{any set } X(t_i+(r-1)\tau), \ldots, X(t_i)\}$ with a 
given set of values of these probabilities, which examine the length u of dependence of the 
chain, which are suitable when s is large, etc. 

In this note we discuss a one-dependent or Markoff chain with s states. We suppose 
$P\{X(t_i) = k \mid X(t_i-\tau) = j\} = p_{jk}$. We shall refer to $p_{jk}$ as a transition probability. We suppose 
also there is a unique occupation probability $P_j$ by which is meant $P\{X(t_N) = j\}$ tends to 
$P_j$ as N tends to infinity. We suppose $P_j$ and $p_{jk}$ exist for all j and k. 

We develop: 

(a) A test of the equality of r sets of transition probabilities $p_{jk}$. This test follows 
automatically from Bartlett’s work. 

(b) A test of the equality of r sets of transition probabilities $p_{jj}$ separated from a test of 
the equality of r sets $p_{jk}/(1-p_{jj})$, j ≠ k. This division of the test (a) is useful when τ is small, 
in that it then throws light on the reasons for the differences in occupation times of the 
s states from set to set. 

(c) A test of the equality of r sets of occupation probabilities $P_j$. This is of interest when 
τ is not small. The test criterion is difficult to calculate on a desk machine, but not on most 
modern electronic computers. We show, when s = 2, the direction of inadequacy of the 
assumption of independence of $X(t_i)$ and $X(t_i+\tau)$ when $X(t_i)$ is a Markoff chain, and a 
comparison is being made of the occupation numbers for two long realizations. 


2. THE COMPARISON OF r SETS OF TRANSITION PROBABILITIES 
Bartlett (1951) showed that: 

(i) $n_{jk}$, the number of values of i for which $X(t_i) = j$, $X(t_i+\tau) = k$ or, say, the number of 
transitions from state j to state k, is for a long realization approximately normally 
distributed with mean $NP_jp_{jk}$ and variance $NP_jp_{jk}(1-p_{jk})$, where N is the length of the 
realization; cov $(n_{jk}, n_{jl})$ is $-NP_jp_{jk}p_{jl}$. We suppose all $P_j$ are greater than 0; 

(ii) if $n_{j.} = \sum_k n_{jk}$, $n_{j.}$ is for large N distributed approximately normally with mean 
$NP_j$ and variance proportional to N, so that $n_{j.}/(NP_j)$ tends to one in probability; 

(iii) the maximum likelihood estimate of $p_{jk}$, $n_{jk}/n_{j.}$, is for large N uncorrelated with 
$n_{lm}/n_{l.}$ if j ≠ l. 

It follows from (i) that for large N the s values $n_{jk}$, k = 1, ..., s have a multinomial 
distribution with index $NP_j$ and parameters $p_{jk}$, k = 1, ..., s; from (iii) that these s 
multinomials are independent of each other; and from (ii) and a theorem of Cramér’s (1946, 
p. 254) on the replacement of parameters by quantities which tend to them in probability, 
that in the multinomial for the s values $n_{jk}$, k = 1, ..., s we may replace $NP_j$ by $n_{j.}$. 

We consider testing the hypothesis that r sets of values $p_{jk}$ are the same, that is that $p_{jkl}$, 
where l = 1, ..., r, is independent of l for every combination j, k. The data are r long 
realizations one from each of the respective chains. We are on very familiar ground because the 
preceding paragraph implies that the test must be related to a combination of tests of 
s r × s contingency tables, each table corresponding to a value j. The likelihood ratio criterion 
(l.r.c.) to test the hypothesis is 
$$2\big[\Sigma\, n_{jkl}\log_e n_{jkl}-\Sigma\, n_{j.l}\log_e n_{j.l}-\Sigma\, n_{jk.}\log_e n_{jk.}+\Sigma\, n_{j..}\log_e n_{j..}\big]. \tag{1}$$
In this and later formulae a dot in place of a suffix means summation has been carried out 
over the replaced variable. Thus, for example, $n_{j..} = \sum_{k,l} n_{jkl}$. The remaining summation signs 
indicate that summation is to be carried out over all possible values of the remaining suffix 
variables: $n_{j.l} \sim n_{.jl}$, with equality for all j, if the last state is the same as the first. 

It is easily proved by a standard expansion of $(n_{jkl}-n_{j.l}\,p_{jk})/\sqrt{n_{j.l}}$, that to first order in 
the $n_{j.l}$ this criterion has the same value as the sum of s χ²-type criteria for s r × s 
contingency tables. The distribution of the criterion (1) therefore is that of χ² with s(r − 1)(s − 1) 
degrees of freedom. 
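The criterion (1) is straightforward to compute from observed realizations. The Python sketch below is a minimal illustration under assumptions made here: states are coded 0, ..., s − 1, the counts are stored as `counts[l][j][k]` for $n_{jkl}$, terms with a zero count are taken to contribute zero (the usual convention 0 log 0 = 0), and the function names are hypothetical.

```python
import math

def transition_counts(sequence, s):
    """Count transitions j -> k in one realization whose states are coded 0..s-1."""
    n = [[0] * s for _ in range(s)]
    for j, k in zip(sequence, sequence[1:]):
        n[j][k] += 1
    return n

def xlogx(n):
    """n log_e n, with the convention that a zero count contributes zero."""
    return n * math.log(n) if n > 0 else 0.0

def likelihood_ratio_criterion(counts):
    """Criterion (1) for counts[l][j][k] = n_{jkl}; returns (statistic, degrees of freedom)."""
    r = len(counts)
    s = len(counts[0])
    total = 0.0
    for j in range(s):
        for l in range(r):
            total -= xlogx(sum(counts[l][j]))                       # n_{j.l}
            for k in range(s):
                total += xlogx(counts[l][j][k])                     # n_{jkl}
        for k in range(s):
            total -= xlogx(sum(counts[l][j][k] for l in range(r)))  # n_{jk.}
        total += xlogx(sum(sum(counts[l][j]) for l in range(r)))    # n_{j..}
    return 2.0 * total, s * (r - 1) * (s - 1)

# Two short made-up realizations of a two-state chain (illustrative data only).
a = transition_counts([0, 1, 1, 0, 1, 0, 0, 1, 1, 1], 2)
b = transition_counts([1, 0, 0, 0, 1, 0, 1, 1, 0, 0], 2)
print(likelihood_ratio_criterion([a, b]))
```

The returned statistic would be referred to the χ² distribution with the returned degrees of freedom; the realizations above are far too short for the large-sample theory to apply and serve only to show the call pattern.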


3. THE COMPARISON OF TRANSITION PROBABILITIES FOR SMALL T 


Suppose the chain is the result of observation of a process defined in continuous time. We
now take the parameter t of § 1 to refer to time. If τ is decreased it may happen that the
length of dependence of the resulting chain is increased or that any type of u-dependence
disappears. We consider only chains which are still Markoff chains for such low values of τ
that it is reasonable to suppose no change of state has been missed, that is, that there has
been at most one change of state between times $t_i$ and $t_i + \tau$.

That such a case may occur in practice is shown by the example in educational theory that 
prompted the work of this paper (see § 5 below). An educator defines a number of states in 
which a class may be: teacher giving formal instruction, pupil asking a question, etc. This 
leads to a chain, approximately Markovian, even for low τ. If τ is smaller than the average
amount of time spent in the state in which the chain stays for the shortest time it is unlikely 
that any change of state will have been missed. 

It then becomes possible to make a simple interpretation of comparisons between sets
of values $p_{j,k}$. We suppose now

$$\Pr\{X(t_i+\tau) = j = X(t_i+2\tau) = \dots = X(t_i+(n-1)\tau);\ X(t_i+n\tau) = k_2 \mid X(t_i) = k_1\}
 = (1-p_{k_1,k_1})\,v_{k_1,j}\;p_{j,j}^{\,n-2}\,(1-p_{j,j})\,v_{j,k_2}. \tag{2}$$

Then $(1-p_{j,j})\,v_{j,k}$ plays the role of $p_{j,k}$, j ≠ k, in the Markoff chain originally considered, and
the use of $v_{j,k}$ amounts to a change of parameters for that chain. The quantity $v_{j,k}$ is a
conditional probability, the probability that if the chain has moved from state j it has moved to
state k. The probability that the chain stays in state j for n - 1 further observations once it
has reached j is $(1-p_{j,j})\,p_{j,j}^{\,n-1}$. This geometric distribution can then be described as the
distribution of time spent in state j. It corresponds to an exponential distribution
$\exp(-\lambda_j t)\,\lambda_j\,dt$ of time spent in state j for the corresponding process defined in continuous
time.

It is commonly desired to compare the occupation times (by the occupation time of
state j we mean the number of times $t_i$ at which the state j was observed) for two or more long
realizations. For chains of independent observations this comparison is made by an ordinary
contingency table analysis. For values of τ not small enough to enable us to make the special
assumptions of this section, the nearest equivalent is a test comparing values of occupation
probabilities $P_j$. We discuss such a test in § 4. When the assumptions of this section hold, the
test of § 4 and the test of § 2 can be supplemented by a separate test of the equality of the
sets $p_{j,j}$. Such a test will be useful for instance when two sets of occupation times are about
equal, but for one realization there has been a rapid fluctuation from state to state, while
for the other there has been a slower rate of change with longer periods spent in each state.
The likelihood ratio criterion to test the hypothesis that r sets $v_{j,k,l}$, l = 1, ..., r, are equal
regardless of the values of the sets $p_{j,j,l}$ is


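$$2\Big[\sum_{j\ne k} n_{j,k,l}\log_e n_{j,k,l} - \sum (n_{j,.,l}-n_{j,j,l})\log_e (n_{j,.,l}-n_{j,j,l})
 - \sum_{j\ne k} n_{j,k,.}\log_e n_{j,k,.} + \sum (n_{j,.,.}-n_{j,j,.})\log_e (n_{j,.,.}-n_{j,j,.})\Big]. \tag{3}$$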


This is of the same form as (1) with $n_{j,.,l} - n_{j,j,l}$ replacing $n_{j,.,l}$. In considering (1) we were able
to suppose the $n_{j,k,l}$, k = 1, ..., s, had a multinomial distribution with index $n_{j,.,l}$, for large $N_l$.
If now we consider $n_{j,j,l}$ fixed in this multinomial distribution the conditional distribution of
the $n_{j,k,l}$, j ≠ k, for large $N_l$ is again multinomial with index $n_{j,.,l} - n_{j,j,l}$ and parameters
$p_{j,k,l}/(1-p_{j,j,l})$. It follows as in § 2 by analogy with the analysis of an r × (s - 1) contingency
table that the distribution of (3) for large $n_{j,.,l} - n_{j,j,l}$ is χ² with s(r - 1)(s - 2) degrees of
freedom. Since this is independent of the $n_{j,.,l} - n_{j,j,l}$ provided they are large, it is the general
asymptotic distribution of (3). These degrees of freedom virtually follow from the fact that
the s(s - 1) values $p_{j,k,l}/(1-p_{j,j,l})$, j ≠ k, have one linear restriction for each value of j.

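The likelihood ratio criterion to test the hypothesis that r sets $p_{j,j,l}$, l = 1, ..., r, are equal
regardless of the value of the sets $v_{j,k,l}$ is

$$2\Big[\sum (n_{j,.,l}-n_{j,j,l})\log_e (n_{j,.,l}-n_{j,j,l}) + \sum n_{j,j,l}\log_e n_{j,j,l} - \sum n_{j,.,l}\log_e n_{j,.,l}
 - \sum (n_{j,.,.}-n_{j,j,.})\log_e (n_{j,.,.}-n_{j,j,.}) - \sum n_{j,j,.}\log_e n_{j,j,.} + \sum n_{j,.,.}\log_e n_{j,.,.}\Big]. \tag{4}$$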


For large $n_{j,.,l}$ this latter criterion is equivalent to the sum of χ² criteria from s r × 2 tables,
since for given $n_{j,.,l}$, $n_{j,j,l}$ is a binomial variable with index $n_{j,.,l}$ and parameter $p_{j,j,l}$, this
distribution being independent of the distributions of $n_{j,k,l}$, k ≠ j, for large $N_l$, $n_{j,.,l}$. The two
criteria (3) and (4) add to give the criterion (1). In the author's view the exact additivity of
likelihood ratio criteria over complementary hypotheses of the main hypothesis that r sets
$p_{j,k,l}$ are equal, is an excellent argument for using them rather than the χ² type criteria to
which they are equivalent for long realizations. These latter criteria annoyingly do not
accurately possess this additivity.

For completeness we give the test of the equality of sets $\lambda_{j,l}$, l = 1, ..., r, of the continuous
process when $T_{j,l}$ is the amount of time the lth realization passes in state j. The likelihood
ratio criterion is

$$2\Big[\sum (n_{j,.,l}-n_{j,j,l})\log_e (n_{j,.,l}-n_{j,j,l}) - \sum (n_{j,.,.}-n_{j,j,.})\log_e (n_{j,.,.}-n_{j,j,.})
 - \sum (n_{j,.,l}-n_{j,j,l})\log_e T_{j,l} + \sum (n_{j,.,.}-n_{j,j,.})\log_e T_{j,.}\Big]. \tag{5}$$

This is, for large $n_{j,.,l} - n_{j,j,l}$, distributed as χ² with s(r - 1) degrees of freedom.
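
The exact additivity of the criteria of this section, (3) + (4) = (1), can be checked numerically. The following is a minimal sketch; the function name and the three-state counts are hypothetical and serve only to exercise the formulae.

```python
import math

def xlogx(n):
    """Return n * log(n), with the convention 0 * log(0) = 0."""
    return n * math.log(n) if n > 0 else 0.0

def split_criteria(tables):
    """Return (criterion_1, criterion_3, criterion_4) for r count matrices.

    tables[l][j][k] is the number of transitions j -> k in realization l.
    """
    r, s = len(tables), len(tables[0])
    c1 = c3 = c4 = 0.0
    for j in range(s):
        row = [sum(t[j]) for t in tables]               # n_{j,.,l}
        stay = [t[j][j] for t in tables]                # n_{j,j,l}
        move = [row[l] - stay[l] for l in range(r)]     # n_{j,.,l} - n_{j,j,l}
        for k in range(s):
            cell = (sum(xlogx(t[j][k]) for t in tables)
                    - xlogx(sum(t[j][k] for t in tables)))
            c1 += cell
            if k != j:
                c3 += cell                              # off-diagonal cells only
        c1 += xlogx(sum(row)) - sum(xlogx(x) for x in row)
        c3 += xlogx(sum(move)) - sum(xlogx(x) for x in move)
        c4 += (sum(xlogx(x) for x in move) + sum(xlogx(x) for x in stay)
               - sum(xlogx(x) for x in row)
               - xlogx(sum(move)) - xlogx(sum(stay)) + xlogx(sum(row)))
    return 2.0 * c1, 2.0 * c3, 2.0 * c4

# Hypothetical three-state counts for two realizations.
a = [[40, 6, 4], [5, 30, 10], [8, 7, 35]]
b = [[20, 15, 12], [9, 22, 16], [14, 11, 28]]
c1, c3, c4 = split_criteria([a, b])
```

The three values are compared on s(r - 1)(s - 1), s(r - 1)(s - 2) and s(r - 1) degrees of freedom respectively, and the first equals the sum of the other two up to rounding error.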


4. THE COMPARISON OF r SETS OF OCCUPATION PROBABILITIES


If for general Markoff chains a test of the equality of r sets $p_{j,k,l}$, l = 1, ..., r, had shown no
over-all significant differences between the sets it might be considered illogical to test
differences between occupation probabilities $P_{j,l}$, since these can be expressed in terms of
the $p_{j,k,l}$. It is, however, conceivable that the sets $p_{j,k,l}$ may differ on the whole so slightly that
the criterion (1) is found to be not significant, but that the particular functions of them
represented by the $P_{j,l}$ may show important differences from one chain to another.
Accordingly, we develop an asymptotic test of the differences between two sets of occupation
probabilities $P_j$, starting from the fact (Bartlett, 1951) that the $n_{j,.}$ are normally distributed
with means $NP_j$ for one realization. We then give the obvious extension to r sets $P_j$.

We begin by finding the covariance matrix of the $n_{j,.}$ for a single long realization of length
N from a chain with transition probabilities $p_{j,k}$. Patankar (1954) has shown that

$$\operatorname{var} n_{j,.} \sim \sum_{\nu=1}^{N-1} 2(N-\nu)P^{(\nu)}_{j,j} + NP_j - N^2P_j^2,$$
$$\operatorname{cov}(n_{j,.}, n_{k,.}) \sim \sum_{\nu=1}^{N-1}(N-\nu)P^{(\nu)}_{j,k} + \sum_{\nu=1}^{N-1}(N-\nu)P^{(\nu)}_{k,j} - N^2P_jP_k, \tag{6}$$

where $P^{(\nu)}_{j,k}$ is the stable probability of the chain being in state j when $t = t_i$ and in state k
when $t = t_i + \nu\tau$. Suppose the matrix of transition probabilities $p_{j,k}$ is called T. We shall
consider that the roots of T are distinct. We are already assuming that the only root of modulus
one is one. Then T has a spectral resolution (see, for example, Bartlett, 1955, p. 26)

$$T = X + \sum_{\lambda\ne 1}\lambda P(\lambda), \tag{7}$$

where X is the s × s matrix each of whose rows is $(P_1, P_2, \dots, P_s)$, λ is a root of T and P(λ)
is a matrix $p_{j,k}(\lambda)$ of order s × s. The quantity $\sum_{\nu} P^{(\nu)}_{j,k}(N-\nu)$ is then expressible

$$\tfrac12 N(N-1)P_jP_k + P_j\sum_{\lambda\ne 1}\sum_{\nu=1}^{N-1}(N-\nu)\lambda^{\nu}p_{j,k}(\lambda). \tag{8}$$

Since |λ| is less than one (Fréchet, 1938, p. 105) this is asymptotic for large N to

$$\tfrac12 N(N-1)P_jP_k + NP_j\sum_{\lambda\ne 1}\frac{\lambda\,p_{j,k}(\lambda)}{1-\lambda}. \tag{9}$$

Introducing this into (6), we may write

$$\operatorname{var} n_{j,.} \sim N\big(2P_j \times (j,j)\text{th element of } (I-T+X)^{-1} - P_j - P_j^2\big),$$
$$\operatorname{cov}(n_{j,.}, n_{k,.}) \sim N\big(P_j \times (j,k)\text{th element of } (I-T+X)^{-1} + P_k \times (k,j)\text{th element of } (I-T+X)^{-1} - P_jP_k\big). \tag{10}$$

We have replaced $\sum p_{j,k}(\lambda)/(1-\lambda)$ by $(I-T+X)^{-1}$; this replacement is permissible since in
this sum |λ| is less than one. If we write D for the diagonal matrix, diag $(P_1, \dots, P_s)$, we may
write for the covariance matrix of the $n_{j,.}$,

$$N\{D(I-T+X)^{-1} + (D(I-T+X)^{-1})' - D(I+X)\}. \tag{11}$$
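
As an illustration of (11), the following sketch evaluates the matrix, divided by N, for a given transition matrix and its stable probabilities. The helper names and the two-state example are assumptions for illustration; the inverse is computed by ordinary Gauss-Jordan elimination rather than by any method described in the paper. For two states the entries can be compared with the closed form $P_1P_2(p_{1,1}+p_{2,2})/(p_{1,2}+p_{2,1})$.

```python
def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def occupation_covariance(T, P):
    """Matrix (11) divided by N: D Z + (D Z)' - D (I + X), with Z = (I - T + X)^{-1}.

    T is the s x s transition matrix and P its stable probability vector;
    every row of X equals P.
    """
    s = len(T)
    M = [[float(j == k) - T[j][k] + P[k] for k in range(s)] for j in range(s)]
    Z = inverse(M)
    return [[P[j] * Z[j][k] + P[k] * Z[k][j] - P[j] * (float(j == k) + P[k])
             for k in range(s)] for j in range(s)]

# Hypothetical two-state chain.
T = [[0.9, 0.1], [0.3, 0.7]]
P = [0.3 / 0.4, 0.1 / 0.4]            # P_1 = p21/(p12+p21), P_2 = p12/(p12+p21)
V = occupation_covariance(T, P)
closed_form = P[0] * P[1] * (0.9 + 0.7) / (0.1 + 0.3)
```

Each row of the result sums to zero, reflecting the singularity of the full covariance matrix.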


The matrix (11) is singular since $\sum_j n_{j,.} = N$. We suppose the first s - 1 of the $n_{j,.}$ have a
non-singular distribution which we write as

$$(2\pi N)^{-\frac12(s-1)}\,|A|^{\frac12}\exp\Big\{-\tfrac12 N\Big(\frac{n_{j,.}}{N}-P_j\Big)'A\Big(\frac{n_{j,.}}{N}-P_j\Big)\Big\}, \tag{12}$$

where the row vector $n_{1,.}/N - P_1, \dots, n_{s-1,.}/N - P_{s-1}$ is written $(n_{j,.}/N - P_j)'$. Then, under
the assumption of a common known A for the two chains being compared, it easily follows
that the likelihood ratio criterion to test the hypothesis that the set $P_j$ for the realization
with occupation numbers $n_{1,.}, \dots, n_{s-1,.}$ is equal to the set $P_j$ for the realization with
occupation numbers $m_{1,.}, \dots, m_{s-1,.}$ is

$$\frac{MN}{M+N}\Big(\frac{m_{j,.}}{M}-\frac{n_{j,.}}{N}\Big)'A\Big(\frac{m_{j,.}}{M}-\frac{n_{j,.}}{N}\Big). \tag{13}$$

The similar criterion for testing a single set of occupation probabilities against a hypo- 
thetical set was suggested by P. 8. Neal (see Patankar, 1954). The joint frequency function 
of the $m_{j,.}$ and the $n_{j,.}$ may be expressed as

$$\text{const.}\,\exp\Big[-\frac{MN}{2(M+N)}\Big(\frac{m_{j,.}}{M}-\frac{n_{j,.}}{N}\Big)'A\Big(\frac{m_{j,.}}{M}-\frac{n_{j,.}}{N}\Big)
 - \frac{1}{2(M+N)}\Big\{M\Big(\frac{m_{j,.}}{M}-P_j\Big)+N\Big(\frac{n_{j,.}}{N}-P_j\Big)\Big\}'A\Big\{M\Big(\frac{m_{j,.}}{M}-P_j\Big)+N\Big(\frac{n_{j,.}}{N}-P_j\Big)\Big\}\Big]. \tag{14}$$

The non-singular transformation

$$z_j = n_{j,.}/N - P_j - (m_{j,.}/M - P_j),$$
$$w_j = N(n_{j,.}/N - P_j) + M(m_{j,.}/M - P_j),$$

enables the criterion (13) to be expressed purely in terms of $z_j$. The exponent in (14) is in
fact the sum of two quadratics, one in $z_j$ of rank s - 1 equal to minus half the criterion (13),
the other in $w_j$ also of rank s - 1. It follows that the criterion (13) is asymptotically
distributed as χ² with s - 1 degrees of freedom when A is known.

However, since A is not known it has to be replaced by an estimate. A is a function of the
$p_{j,k}$. We replace $p_{j,k}$ by the pooled likelihood estimate $(m_{j,k}+n_{j,k})/(m_{j,.}+n_{j,.})$. Under the null
hypothesis of equal sets $p_{j,k}$, this estimate is for large N + M distributed normally with mean
$p_{j,k}$ and variance proportional to 1/(N + M). It follows that the estimate tends in probability
to $p_{j,k}$. Since A is a finite function of the s² values $p_{j,k}$, $A(\hat p_{j,k})$ also tends in probability to
its true value as M + N becomes large. By a theorem of Cramér (1946, p. 254) this replacement
of A by such an estimate will not affect the limiting distribution of the criterion (13).

Until recently it would have been of purely academic interest to suggest testing a
criterion the calculation of which even by a computer using a desk machine is extremely
laborious. But with the growth in numbers of electronic computers, most of which have
inversion and multiplication of matrices in their libraries, calculation of the criterion (13)
will often be feasible and cheap (the impetus to provide a test of occupation was in fact
provided by an organization with access to a high speed computer).
Examination of the criterion (13) for a small number s of states throws light on the
behaviour of the χ² test of a contingency table of occupation numbers when the chain is
one-dependent and not zero-dependent as required for the validity of such a test. We discuss
the simplest example s = 2.


4.1. Two states; s = 2

T is now $\begin{pmatrix} p_{1,1} & p_{1,2}\\ p_{2,1} & p_{2,2}\end{pmatrix}$ and X is $\begin{pmatrix} P_1 & P_2\\ P_1 & P_2\end{pmatrix}$,
where $P_1 = p_{2,1}/(p_{1,2}+p_{2,1})$ and $P_2 = p_{1,2}/(p_{1,2}+p_{2,1})$.
Then $|I-T+X| = p_{1,2}+p_{2,1}$ and if the matrix (11) is called VN,

$$V = \frac{P_1P_2\,(p_{1,1}+p_{2,2})}{p_{1,2}+p_{2,1}}\begin{pmatrix} 1 & -1\\ -1 & 1\end{pmatrix}.$$

It follows that

$$\frac{MN\,(n_{1,.}/N - m_{1,.}/M)^2\,(p_{1,2}+p_{2,1})}{(M+N)\,P_1P_2\,(p_{1,1}+p_{2,2})} \tag{16}$$

is asymptotically distributed as χ² with 1 degree of freedom.
In calculating this criterion we use the maximum likelihood estimates

$$\hat p_{1,2} = (m_{1,2}+n_{1,2})/(m_{1,.}+n_{1,.}), \qquad \hat p_{2,1} = (m_{2,1}+n_{2,1})/(m_{2,.}+n_{2,.}),$$
$$\hat p_{1,1} = (m_{1,1}+n_{1,1})/(m_{1,.}+n_{1,.}) \quad\text{and}\quad \hat p_{2,2} = (m_{2,2}+n_{2,2})/(m_{2,.}+n_{2,.}).$$

Since $m_{j,.} \sim m_{.,j}$ and $n_{j,.} \sim n_{.,j}$ we may substitute

$$(m_{1,.}+n_{1,.})/(M+N) \quad\text{and}\quad (m_{2,.}+n_{2,.})/(M+N)$$

for $P_1$ and $P_2$ respectively, where $M = m_{1,.}+m_{2,.}$ and $N = n_{1,.}+n_{2,.}$. We may compare
this criterion (16) with the usual χ² criterion for the 2 × 2 table

$$\begin{array}{cc} m_{1,.} & m_{2,.}\\ n_{1,.} & n_{2,.}\end{array}$$

This latter criterion is

$$\frac{(M+N)\,(n_{1,.}m_{2,.} - n_{2,.}m_{1,.})^2}{MN\,(m_{1,.}+n_{1,.})(m_{2,.}+n_{2,.})}. \tag{17}$$

If $P_1$ and $P_2$ in (16) are replaced by their maximum likelihood estimates, the only difference
between the criteria is the extra factor $(p_{1,2}+p_{2,1})/(p_{1,1}+p_{2,2})$ in (16). Thus if the chain were
one-dependent but the χ² criterion for a 2 × 2 table were used as though there were
zero-dependence, the expected value of the criterion would not be one but $(p_{1,1}+p_{2,2})/(p_{1,2}+p_{2,1})$.
This factor may seriously disturb the validity of the deductions drawn. For instance in the
educational example referred to in § 3, the educator may define only two states which may,
for example, be teacher-initiated and pupil-initiated activity. An interval of time τ between
observations smaller than the average time spent in these states will probably mean that
$p_{1,1}$ and $p_{2,2}$ dominate $p_{1,2}$ and $p_{2,1}$ respectively. Then in using χ² one would run the risk of
claiming more significance than is really there.
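
The relation between (16) and (17) can be seen in a small numerical sketch. The function name and the counts below are hypothetical; the counts were chosen so that both chains tend to remain in their current state.

```python
def two_state_criteria(m, n):
    """Return (criterion_16, criterion_17, factor) for two 2 x 2 count matrices.

    m[j][k] and n[j][k] are the numbers of transitions j -> k in the two
    realizations; factor is (p12 + p21) / (p11 + p22) with pooled estimates.
    """
    m1, m2 = sum(m[0]), sum(m[1])        # occupation numbers m_{1,.}, m_{2,.}
    n1, n2 = sum(n[0]), sum(n[1])        # occupation numbers n_{1,.}, n_{2,.}
    M, N = m1 + m2, n1 + n2
    # Pooled maximum likelihood estimates of the transition probabilities.
    p = [[(m[j][k] + n[j][k]) / (sum(m[j]) + sum(n[j])) for k in range(2)]
         for j in range(2)]
    P1, P2 = (m1 + n1) / (M + N), (m2 + n2) / (M + N)
    factor = (p[0][1] + p[1][0]) / (p[0][0] + p[1][1])
    c16 = M * N * (n1 / N - m1 / M) ** 2 * factor / ((M + N) * P1 * P2)
    c17 = (M + N) * (n1 * m2 - n2 * m1) ** 2 / (M * N * (m1 + n1) * (m2 + n2))
    return c16, c17, factor

# Hypothetical counts for two "sticky" two-state chains.
m = [[80, 20], [20, 80]]
n = [[60, 15], [15, 110]]
c16, c17, factor = two_state_criteria(m, n)
```

With these counts the factor is well below one, so the ordinary 2 × 2 criterion (17) is several times larger than (16).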


4.2. The comparison of r sets of occupation probabilities


Suppose it is desired to test the hypothesis that r long realizations are of Markoff chains
with the same set of occupation probabilities. Then by a simple extension of the argument
of § 4 we can show that the likelihood ratio criterion is

$$\sum N_l\,x_l'Ax_l - \Big(\sum N_l\Big)\bar x'A\bar x = \sum N_l\,(x_l-\bar x)'A(x_l-\bar x). \tag{18}$$

In this the summations are taken over the r realizations for each of which there is an s × 1
vector $x_l$ of quantities of the type $n_{j,.}/N$ or $m_{j,.}/M$ of § 4.1; $N_l$ is the length of the lth
realization and $\bar x = \sum N_l x_l/\sum N_l$. The form of A is given by (11)/N with the maximum likelihood
estimate $n_{j,k,.}/n_{j,.,.}$ substituted for $p_{j,k}$. The criterion (18) is distributed as χ² with (r - 1)(s - 1)
degrees of freedom.


5. EXAMPLE 


The only example on which the methods of this paper have been tried is provided by data 
arising from the researches of Dr Ned A. Flanders of the University of Minnesota. These 
data exhibit such extreme differences that a statistical analysis is scarcely needed. Never- 
theless, calculation of the criteria of the previous paragraphs reinforces the argument of 
§ 4.1 and also shows the usefulness of the tests described in § 3 when they are applicable.

Dr Flanders describes six different states in which a class may be, as follows: 

(1) Teacher accepts ideas, feelings and gives praise so as to integrate student behaviour 
in the problem solving process. 

(2) Teacher asks question. 

(3) Teacher lectures or gives his own opinions and ideas. 

(4) Teacher gives directions, criticism or justifies the use of his authority. 

(5) Student talks. 

(6) None of the above; including silence, confusion and administrative-routine state- 
ments made by the teacher. 

It is thought that the quality of the teacher is shown by the way in which his classroom 
switches from state to state as well as by the proportion of time the class spends in each state. 
Differences between teachers for particular transition probabilities may of course be tested. 
The comparison of complete sets of transition probabilities and occupation probabilities 
provides a numerical summary of teacher differences which, it is hoped, will be useful in 
providing a check on the usual subjective summary. 

The states of the two classrooms were observed at short regular intervals.
Tests of the length of dependence in a chain of such observations suggest that it is approxi- 
mately true to suppose the observations come from a Markoff chain so that the methods of 
the paper are applicable. The numbers of observations in the different states are large 
enough to inspire confidence in the use of our asymptotic tests. 

The data are, for the two classrooms:

  State |    1    2    3    4    5    6 |    1    2    3    4    5    6
  ------+-------------------------------+------------------------------
    1   |   22   11   24    2    2    7 |    3    6    9    3    ?    ?
    2   |    5   23   15    3   42    6 |    1    9    3   12   27    5
    3   |    4   21  190   25   20   34 |    2    9  208   32    5   18
    4   |    0    2   14   56   14   28 |    0   14   32  108   40   40
    5   |   32   15   20   10   56   14 |   22   14    9   26  224   14
    6   |    5   22   31   18   18  184 |    1    5   13   58   18    ?





The criterion (1) to test the hypothesis that the two sets of transition probabilities are the
same has the value 161.0 which on 30 degrees of freedom is very significant. This suggests
that the test of the equality of occupation probabilities will be difficult to interpret. We
shall, nevertheless, calculate the required criterion in order to compare it with the criterion
calculated on the assumption that successive observations in the classroom are statistically
independent.

The estimate P' of the row vector of occupation probabilities is the row vector of
quantities $(m_{j,.}+n_{j,.})/(M+N)$. This is (0.04746, 0.07387, 0.27789, 0.17025, 0.22309, 0.20743).
This estimate satisfies the equation $P'\hat T = P'$ to within one figure in the fifth decimal
place, where $\hat T$ is equal to T with $(m_{j,k}+n_{j,k})/(m_{j,.}+n_{j,.})$ for $p_{j,k}$. The estimate of the
covariance matrix of occupation numbers for a realization of length N is, as in (11),


       0.06711   0.01035  -0.03182  -0.03164   0.01934  -0.03334
                 0.09545  -0.06514  -0.03412   0.03807  -0.04460
  N ×                      0.76613  -0.14075  -0.31241  -0.21599
                                     0.30988  -0.10108  -0.00227
                                               0.55254  -0.19644
                                                         0.49266


The matrix A required by (12) may be found by inverting, for example, the bottom right-
hand 5 × 5 matrix in the above 6 × 6 matrix. The criterion (13) for testing the equality of the
sets of occupation probabilities is then 60.1. The χ² criterion for the 2 × 6 contingency table
of values $m_{j,.}$ and $n_{j,.}$ is 126.0. Thus this example shows the great reduction that can occur
in the criterion to test this hypothesis when independence of successive observations is
not assumed and when $p_{j,j}$ dominates $p_{j,k}$, j not equal to k. This situation for s = 2 was
discussed in § 4.1.
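
The inversion step can be sketched as follows. The code enters the upper triangle of the 6 × 6 matrix printed above, confirms that each row of the symmetric matrix sums to zero within rounding (the singularity noted after (11)), and inverts the bottom right-hand 5 × 5 block. The helper names are assumptions, and plain Gauss-Jordan elimination is used here in place of whatever library routine was originally employed.

```python
# Upper triangle of the printed 6 x 6 matrix (per unit N), row by row.
UPPER = [
    [0.06711, 0.01035, -0.03182, -0.03164, 0.01934, -0.03334],
    [0.09545, -0.06514, -0.03412, 0.03807, -0.04460],
    [0.76613, -0.14075, -0.31241, -0.21599],
    [0.30988, -0.10108, -0.00227],
    [0.55254, -0.19644],
    [0.49266],
]

def symmetric(upper):
    """Fill a full symmetric matrix from its upper triangle."""
    s = len(upper)
    v = [[0.0] * s for _ in range(s)]
    for j, row in enumerate(upper):
        for offset, value in enumerate(row):
            v[j][j + offset] = v[j + offset][j] = value
    return v

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

V = symmetric(UPPER)
row_sums = [sum(row) for row in V]      # each close to zero
V5 = [row[1:] for row in V[1:]]         # bottom right-hand 5 x 5 block
A = inverse(V5)                         # the matrix A required by (12)
```

Given vectors of the last five occupation proportions for the two classrooms, the criterion (13) is then MN/(M + N) times the quadratic form in their difference with matrix A.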

The assumptions of § 3 are known to be approximately true for data of this kind. Hence
we may obtain further information on the factors affecting the great difference between the
sets of occupation probabilities by considering the tests described in that paragraph. The
criterion for testing the equality of the sets of conditional probabilities $v_{j,k}$, j not equal to
k, has the value 95.0 on 24 degrees of freedom, and the criterion for testing the equality
of the sets $p_{j,j}$ has the value 66.0 on 6 degrees of freedom. Thus it appears that differences
in the occupation probabilities are strongly affected by differences in the probabilities of
change from one state to another, and more strongly (per degree of freedom involved) by
the differences in the length of time spent in a state before a change occurs.


OVERFLOW AT A TRAFFIC LIGHT


FRANK A. HAIGHT 


Institute of Transportation and Traffic Engineering, 
University of California, Los Angeles 


1. LIGHTS IN GENERAL 


Suppose vehicles of uniform length arrive at an intersection controlled by a traffic light, with
arrival times which constitute a homogeneous Poisson process, with parameter λ. The effect
of acceleration will be ignored; once a vehicle is near the traffic light it travels with a uniform
speed S unless stopped. When a queue of vehicles is stopped by the light, headway (distance
separation between corresponding parts of adjacent vehicles) is constant, and when the
cars move off the headway will be a larger constant, denoted by K.

The letter β will be used to indicate the length of the red phase, so that the expected
number of arrivals during a red phase will be λβ for a fixed cycle light and λE(β) for a
variable cycle light. During the green phase, when there is a queue of cars waiting, they are
discharged with time separation K/S = T. Once the queue present at the beginning of the
green phase has been emptied, the Poisson arrivals continue through the intersection
without delay for the remainder of the green phase. Instead of characterizing the green phase
by its length, we will use an integer N, which is the largest multiple of T contained
in its length:

length green = α = NT + θT (0 ≤ θ < 1).

Thus, with a sufficiently long queue, at most N vehicles can be discharged during a green
phase. N is, however, by no means the maximum number of vehicles which may pass through
the intersection during a green phase, for once the queue is dissipated, the Poisson stream
may send cars through with time separation < T. In some circumstances N will be a
constant, and in others a stochastic variable governed by vehicle actuation on the side street.

In speaking of input and output, we do not refer to input and output to the intersection, 
but only to the congestion at the intersection. Thus, when there is no queue, a car may come 
into the intersection and pass out of it, leaving both input and output zero. It is essential 
to specify under what circumstances input is possible. It might be assumed that S is as 
great as the speed of arriving vehicles, and so input would be zero as soon as the last queued 
vehicle began to move, provided that vehicle cleared the intersection in a single green 
phase. Such an assumption seems to conflict with our experience and certainly leads to an 
unwieldy formulation. Consequently we will assume continued input in every circumstance 
except zero queue length. 


2. FIXED CYCLE LIGHTS 


Suppose β, T and N are constants. We will have occasion to use the Poisson probability,
which will be abbreviated

$$P(u; \lambda) = e^{-\lambda}\lambda^{u}/u! \qquad (u = 0, 1, 2, \dots),$$

and the Borel-Tanner probability [cf. Borel (1942), Tanner (1953), Haight & Breuer (1960)],
which will be abbreviated

$$R(u; r, \rho) = A(u, r)\,e^{-\rho u}\rho^{u-r} \qquad (u = r, r+1, \dots),$$

where

$$A(u, r) = \frac{r\,u^{u-r-1}}{(u-r)!}.$$

This represents the probability that exactly u members of a queue will be served before
the queue first becomes empty, beginning with r members and with traffic intensity ρ,
assuming Poisson input and regular discharge. In our application, during the green phase,
ρ always equals λT [approximately αλ/N], and will be omitted, writing R(u; r).
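
A minimal sketch of the two probabilities may help fix the notation. The function names are assumptions made for illustration; the special cases for r = 0 and u < r anticipate the conventions adopted later in this section.

```python
import math

def poisson(u, lam):
    """Poisson probability P(u; lam), taken as zero for u < 0."""
    if u < 0:
        return 0.0
    return math.exp(-lam) * lam ** u / math.factorial(u)

def borel_tanner_coefficient(u, r):
    """A(u, r) = r u^(u-r-1) / (u-r)! for u >= r >= 1."""
    return r * u ** (u - r - 1) / math.factorial(u - r)

def borel_tanner(u, r, rho):
    """Borel-Tanner probability R(u; r, rho)."""
    if r == 0:
        return 1.0 if u == 0 else 0.0
    if u < r:
        return 0.0
    return borel_tanner_coefficient(u, r) * math.exp(-rho * u) * rho ** (u - r)

# With traffic intensity below one the probabilities sum to one.
total = sum(borel_tanner(u, 2, 0.5) for u in range(2, 301))
```

Note that A(r, r) = 1, so that R(r; r) reduces to the probability of no arrivals while the initial r members are served.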

The purpose of this paper is to compute the probability of Z cars being in the queue at
the beginning of a red phase, when there were X cars in the queue at the beginning of the
preceding green phase. Z will be called the overflow into the red phase. If X > N, the
overflow must be non-zero, and will depend only on the input during the green phase. Letting

$$f(z; x) = \Pr[Z = z \mid X = x],$$

we have

$$f(z; x) = P(z - x + N; \alpha\lambda) \qquad (x > N). \tag{1}$$

When x ≤ N, z = 0, the formula is also fairly simple, and can be written in terms of
cumulative Borel-Tanner probabilities as follows:

$$f(0; x) = \sum_{j=x}^{N} R(j; x) \qquad (x \le N). \tag{2}$$

In the remaining cases (z > 0, x ≤ N), consider, for fixed x, the event that the overflow
would be z, and that no vehicles would arrive after the light turned red in the period that it
would take this overflow to clear. The probability of this is $f(z; x)\,e^{-\rho z}$, and it may be
regarded as the probability that the queue would first vanish after N + z vehicles and that the
overflow was z. The unconditional probability of the queue first clearing after N + z vehicles
is of course R(N + z; x), and therefore the difference $R(N + z; x) - f(z; x)\,e^{-\rho z}$ is the sum from
j = 1 to j = z - 1 of the probability that the queue would first vanish after N + z vehicles
and that the overflow is j, that is

$$\sum_{j=1}^{z-1} f(j; x)\,R(z; j),$$

from which we obtain

$$f(z; x) = e^{\rho z}\Big[R(N + z; x) - \sum_{j=1}^{z-1} R(z; j)\,f(j; x)\Big] \qquad (z > 0,\ x \le N). \tag{3}$$

Equations (1), (2) and (3) taken together give a formula for f(z; x) in terms of the preceding
values f(j; x), j = 1, 2, ..., z - 1, for various values of z and x. In conjunction with these
expressions, it is necessary to use the following conventions:

A. P(u; λ) = 0 for u < 0,
B. R(u; r) = 0 for u < r,
C. R(u; 0) = 0 for u > 0
           = 1 for u = 0.

Then it will be noted that f(z; x) = 0 whenever z + N < x, and this is in fact an impossible
transition. The first two values of (3) are

$$f(1; x) = e^{\rho}R(N + 1; x),$$
$$f(2; x) = e^{2\rho}\big[R(N + 2; x) - \rho e^{-\rho}R(N + 1; x)\big].$$

It is also possible to write each f(z; x) explicitly in terms of the Borel-Tanner coefficients
A(z, x). If we define B(z, x) by means of the (z - x)th order determinant

$$B(z, x) = (-1)^{z-x}\begin{vmatrix}
A(z, z-1) & 1 & 0 & \cdots & 0\\
A(z, z-2) & A(z-1, z-2) & 1 & \cdots & 0\\
A(z, z-3) & A(z-1, z-3) & A(z-2, z-3) & \cdots & 0\\
\vdots & \vdots & \vdots & & 1\\
A(z, x) & A(z-1, x) & A(z-2, x) & \cdots & A(x+1, x)
\end{vmatrix} \tag{4}$$

for z > x and B(z, z) = 1, then we can write f(z; x) in the form

$$f(z; x) = \sum_{j=1}^{z} e^{\rho j}\rho^{z-j}B(z, j)\,R(N + j; x)
 = e^{-\rho N}\rho^{N+z-x}\sum_{j=1}^{z} B(z, j)\,A(N + j, x) \qquad (z > 0,\ x \le N). \tag{5}$$

With a table of A(z, x), the coefficients B(z, x) can be most conveniently calculated from the
formula

$$B(z, x) = -\sum_{j=x}^{z-1} A(z, j)\,B(j, x). \tag{6}$$

The first few values of A(z, x) and B(z, x) are given below; to avoid awkward fractions, each
value is multiplied by (z - 1)!

[Tables of (z - 1)! A(z, x) and (z - 1)! B(z, x)]

To show that, for fixed x, $\sum_z f(z; x) = 1$, consider

$$\sum_{j=1}^{\infty} e^{-\rho j}f(j; x) = \sum_{j=1}^{\infty} R(N + j; x) - \sum_{j=1}^{\infty} f(j; x)\,(1 - e^{-\rho j}),$$

which is equivalent to

$$\sum_{j=1}^{\infty} f(j; x) = \sum_{j=1}^{\infty} R(N + j; x).$$

Combining this with (2) gives the value unity provided x ≤ N; in the opposite case the
result is clear from (1).
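
The recursion (1) to (3), the explicit form (5) with coefficients from (6), and the normalization just proved can all be checked numerically. The sketch below is illustrative only: the values of ρ and N are assumed, the green phase is taken to be exactly NT (θ = 0) so that αλ = Nρ, and the sum is truncated at a small overflow because the recursion involves differences of nearly equal quantities for large z.

```python
import math
from functools import lru_cache

RHO = 0.3   # assumed traffic intensity rho = lambda * T
N = 5       # assumed number of discharge slots in the green phase

def A(u, r):
    """Borel-Tanner coefficient A(u, r) = r u^(u-r-1) / (u-r)!."""
    return r * u ** (u - r - 1) / math.factorial(u - r)

def R(u, r):
    """Borel-Tanner probability R(u; r), with conventions B and C."""
    if r == 0:
        return 1.0 if u == 0 else 0.0
    if u < r:
        return 0.0
    return A(u, r) * math.exp(-RHO * u) * RHO ** (u - r)

def poisson(u, lam):
    """Poisson probability, with convention A."""
    return math.exp(-lam) * lam ** u / math.factorial(u) if u >= 0 else 0.0

@lru_cache(maxsize=None)
def f(z, x):
    """Overflow probability f(z; x) from equations (1), (2) and (3)."""
    if x > N:
        return poisson(z - x + N, N * RHO)               # equation (1), theta = 0
    if z == 0:
        return sum(R(j, x) for j in range(x, N + 1))     # equation (2)
    inner = sum(R(z, j) * f(j, x) for j in range(1, z))
    return math.exp(RHO * z) * (R(N + z, x) - inner)     # equation (3)

@lru_cache(maxsize=None)
def B(z, x):
    """Coefficient B(z, x) from recurrence (6), with B(z, z) = 1."""
    if z == x:
        return 1.0
    return -sum(A(z, j) * B(j, x) for j in range(x, z))

def f_explicit(z, x):
    """Overflow probability from the explicit form (5), for z > 0 and 1 <= x <= N."""
    total = sum(B(z, j) * A(N + j, x) for j in range(1, z + 1))
    return math.exp(-RHO * N) * RHO ** (N + z - x) * total

total_probability = sum(f(z, 2) for z in range(0, 13))
```

For these assumed values the truncated sum is indistinguishable from one, and the recursive and explicit forms agree for the first few overflow values.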

During the red phase, the transition probabilities are simple Poisson expressions. In
terms of these and f(z; x), we can write equations involving other probabilities. For example,
if $p_{xy}$ is the probability of a transition from x cars at the beginning of green to y cars at the
beginning of the following green, we have

$$p_{xy} = P(y - x + N; (\alpha + \beta)\lambda) \qquad (x > N)$$
$$\phantom{p_{xy}} = P(y; \lambda\beta)\sum_{j=x}^{N} R(j; x) + \sum_{z=1}^{y} P(y - z; \lambda\beta)\,f(z; x) \qquad (x \le N). \tag{7}$$

Also, letting
$\pi_n$ = Pr(n cars waiting at beginning of green | t = ∞),
$\sigma_n$ = Pr(n cars waiting at beginning of red | t = ∞),

we have (if these limiting probabilities exist)

$$\sigma_n = \sum_{j=0}^{n+N} f(n; j)\,\pi_j \tag{8}$$

and

$$\pi_n = \sum_{j=0}^{n} P(n - j; \lambda\beta)\,\sigma_j. \tag{9}$$


3. SEMI-ACTUATED LIGHTS 


In this section we consider the traffic along a main street into a signalized intersection 
governed by a light (possibly) actuated by side street traffic. When the light turns green 
for the main street, it must remain green for a minimum fixed period (the main street mini- 
mum). After the expiration of this time the light will change whenever a car arrives on the 
side street. Then the light will be red on the main street for a fixed period (side street 
initial), and a further fixed period (side street extension). If any side street vehicles arrive 
during the first side street extension the main street light remains red for a further side 
street extension, measured from the time of arrival of the car. This process continues as long 
as cars arrive on the side street, or until a maximum value (side street maximum) is reached, 
when it turns green again. We will assume Poisson arrivals on both streets, and use the 
following notation: 


λ = main street input parameter,

μ = side street input parameter,

a = length of side street initial,

b = length of side street extension,
N = number of cars able to clear during main street minimum,
α = length of main street initial,


A = length of side street maximum.
These quantities are fixed constants of the system. In addition, we need to define certain
stochastic variables which will be used to characterize the system.
β = length of main street red. Defined over a + b ≤ β ≤ A, with density function g(u),
n = number of cars able to pass through a main street green period. Defined over
N, N + 1, N + 2, ..., with probability distribution $q_n$. For simplicity we will call n
the number of 'slots' provided by the green phase.
Once explicit forms are found for g(u) and $q_n$, the problem of determining the various
distributions of § 2 is solved. For example, an equation like (3) would only need to be
modified to take into account the stochastic nature of the number of slots:

$$f(z; x) = \sum_{n=N}^{\infty} e^{\rho z}\Big[R(n + z; x) - \sum_{j=1}^{z-1} R(z; j)\,f(j; x)\Big]q_n.$$


In case the side street maximum is infinite, both of these distributions are easy to obtain. 

The red phase length distribution is given in a slightly different context by Raff (1951), 

and was apparently discovered by Garwood (1940). In the notation defined above, it is 
(u—b(1+%))} | [e(u—8(1 +%))1° 


h-1 
g(u) = hw em © ( —e—myt (xt (i—1)! + a! 





hb <u<(h+1)b (h=1,2,...). 


The slot distribution can be calculated directly from the Poisson probabilities and turns
out to be

$$q_n = 1 - e^{-\mu\alpha} \qquad (n = N)$$
$$\phantom{q_n} = (1 - e^{-\mu T})\,e^{-\mu T(n-1)} \qquad (n > N),$$

where α = NT.
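
The slot distribution is easily tabulated. In the sketch below the function name and the parameter values are hypothetical; the check at the end confirms that the probabilities sum to one, as they must since α = NT.

```python
import math

def slot_probability(n, N, mu, T):
    """Probability q_n that the green phase provides n slots.

    N is the number of slots in the main street minimum, mu the side street
    input parameter and T the discharge time separation.
    """
    alpha = N * T
    if n < N:
        return 0.0
    if n == N:
        # A side street car arrives during the main street minimum.
        return 1.0 - math.exp(-mu * alpha)
    # The first side street arrival falls in the (n - N)th slot after the minimum.
    return (1.0 - math.exp(-mu * T)) * math.exp(-mu * T * (n - 1))

# Hypothetical values: N = 4 slots, mu = 0.2 cars per second, T = 2 seconds.
total = sum(slot_probability(n, 4, 0.2, 2.0) for n in range(4, 400))
```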


REFERENCES 


BOREL, E. (1942). Sur l’emploi du théorème de Bernoulli pour faciliter le calcul d’une infinité de
coefficients. Application au problème de l’attente à un guichet. C.R. Acad. Sci., Paris, 214, 452-6.

GARWOOD, F. (1940). An application of the theory of probability to the operation of vehicular-
controlled traffic signals. J. R. Statist. Soc. (Suppl.), 7, 65-77.

HAIGHT, F. A. & BREUER, M. A. (1960). The Borel-Tanner distribution. (To appear.)

RAFF, M. S. (1951). The distribution of blocks in an uncongested stream of automobile traffic. J. Amer.
Statist. Ass. 46, 114-23.

TANNER, J. C. (1953). A problem of interference between two queues. Biometrika, 38, 58-69. 







PARTIAL TESTS FOR PARTIAL TAUS* 


By LEO A. GOODMAN 
University of Chicago 


1. INTRODUCTION AND SUMMARY 


In his 1942 paper, Kendall introduced and discussed the partial rank correlation coefficient
tau (τ). Since then, the difficulties involved in developing tests of significance for partial τ
have been discussed by Kendall (1948) and the sampling distribution of partial τ has been
studied by Hoeffding (1948) and Moran (1951). This work has been briefly summarized by
Kendall (1955) as follows: ‘No tests of significance are yet known for partial τ.... Hoeffding
(1948) gives a rather complicated expression for the variance of partial τ in the limiting
case when n [the sample size] is large.... Moran (1951) has considered partial τ without
reaching any clear conclusions other than that the distributional problem is a very complex
one.’ In the present paper, tests of significance will be presented for a number of different
partial correlation coefficients that are closely related to partial τ. These tests will be valid
when the sample size is large and when certain additional assumptions are made about the
situations under consideration (see § 2).

The partial correlation coefficients introduced herein are computed from ranked data,
as is partial τ, and they measure various aspects of what is usually meant by partial
association. For the situations considered in the present paper, some justification can be found
for the use of a weighted average of the partial correlation coefficients introduced herein
as an over-all measure of partial association; less justification can be found for the use of
partial τ, which is also a summary measure, in a certain general sense, related to these
partial correlation coefficients. For some justification for the use of partial τ in situations
that differ somewhat from those considered herein, the reader is referred to Kendall (1942),
and Hoeffding (1948).

The partial correlation coefficients discussed herein are related to, but differ from, the 
generalized partial correlation coefficients introduced by Somers (1959). The methods 
developed in the present paper could, no doubt, be generalized in order to obtain tests of 
significance for various kinds of partial correlation coefficients. For the situations considered 
in the present paper, the use of the particular partial correlation coefficients discussed 
herein has some justification; these coefficients can be given simple probabilistic and 
operational interpretations, and the tests of significance developed for these coefficients are 
quite easy to apply. 


2. THE MEASUREMENT OF PARTIAL ASSOCIATION 


Partial association usually refers to the study of whether an association or correlation of 
quality Q with quality R is really due to the associations of each with a third quality P. 
Theories of partial association usually attempt to decide the matter by the consideration 
of subpopulations in which the variation of P is eliminated (see, for example, Kendall, 
1955). The partial association between Q and R, when the variation of P is, in a certain sense, 
eliminated, will be studied herein in situations where the statistical analysis is to be based 
on the rank orders of the observed values of Q, of R, and of P. 


* Research carried out at the Statistical Research Center, University of Chicago, under sponsorship 
of the Statistics Branch, Office of Naval Research. Reproduction in whole or in part is permitted for 
any purpose of the United States Government. 

We shall follow the usual practice of arranging the ranks of the observed values of P in 
their natural order 1,2,3,...,n, and of assuming that no ties are present among these 
observed values (see Kendall, 1955). Let $P_i$ denote the rank of the $i$th observation (in their 
natural order) with regard to quality P, so that $P_i = i$ for $i = 1, 2, \ldots, n$. We shall assume 
herein that the conditional joint distribution 

$$\Pr\{Q \le q,\ R \le r \mid P_i\} = F(q, r \mid P_i)$$

of the variables Q and R, when $P_i$ is given, is continuous so that the problem of ties among 
the observed values of Q and R can also be ignored. Thus, we assume that Q and R are 
continuous random variables. Only the rank orders of the actual observations of Q and R 
(and not the actual observations themselves) will be used in the subsequent statistical 
analysis. The symbols $Q_i$ and $R_i$ will denote the ranks of the actual observations $Q^{(i)}$ and 
$R^{(i)}$, respectively, associated with a given $P_i$. 

If $Q^{(i)}$ and $R^{(i)}$ are statistically independent of each other when $P_i$ is given, then 

$$\Pr\{Q^{(i)} \le q,\ R^{(i)} \le r \mid P_i\} = F(q, r \mid P_i)$$

is the product of the conditional distribution $G(q \mid P_i)$ of Q given $P_i$ and the conditional 
distribution $H(r \mid P_i)$ of R given $P_i$; i.e. 

$$F(q, r \mid P_i) = G(q \mid P_i)\, H(r \mid P_i) \qquad (i = 1, 2, \ldots, n).$$


In this case, by adapting the usual meaning of partial association we find that the partial 
association between Q and R is zero when the variation of P is eliminated in a certain sense, 
but the expected ‘association between the agreements of [rankings] Q with P and those of 
R with P’ will in general not be zero. To see that the expected association between the 
agreements of rankings Q and R with P will in general not be equal to zero even when 

$$F(q, r \mid P_i) = G(q \mid P_i)\, H(r \mid P_i) \qquad (i = 1, 2, \ldots, n),$$

consider the situation where the following two conditions are true: 

(I) the $n$ pairs $(Q^{(i)}, R^{(i)})$ are statistically independent observations (the conditional 
distribution $F(q, r \mid P_i)$ of each pair may depend on $P_i$). 

(II) $\Pr\{Q^{(i)} > Q^{(j)} \mid P_i, P_j\} = b_{i-j}$, $\Pr\{R^{(i)} > R^{(j)} \mid P_i, P_j\} = c_{i-j}$. 

(Condition (II) attaches somewhat more significance to the ranks of quality P than is 
usually done. This condition will be satisfied if, for example, the conditional distributions 
$G(q \mid P_i)$, where $P_i = 1, 2, \ldots, n$, differ only in location, the location parameter is a linear 
function of $P_i$, and a similar statement holds true for $H(r \mid P_i)$; this will often be the case 
when $Q^{(i)}$ and $R^{(i)}$ are linearly related to $P_i$. It should be remembered that linearity also plays 
an important role in the usual theories of partial correlation for quantitative variables (see, 
for example, Cramér, 1946). For a description of some situations where conditions (I) 
and (II) can be assumed to hold true, the reader is referred to Goodman & Grunfeld, 
1959.) Assuming conditions (I) and (II), we see that, when $F(q, r \mid P_i) = G(q \mid P_i)\, H(r \mid P_i)$, 
the following condition holds true: 


(III) $\Pr\{Q^{(i)} > Q^{(j)},\ R^{(i)} > R^{(j)} \mid P_i, P_j\} = b_{i-j}\, c_{i-j}$. 

Considering the $\tfrac12 n(n-1)$ pairs $(Q_i, R_i, P_i)$, $(Q_j, R_j, P_j)$, where $i > j$, the expected 
proportion of these pairs that are such that $Q_i > Q_j$ and $R_i > R_j$ (given that $P_i > P_j$) is 

$$a = \sum_{i>j} b_{i-j}\, c_{i-j}\, \frac{2}{n(n-1)},$$

while the expected proportion of these pairs that are such that $Q_i > Q_j$ is 

$$b = \sum_{i>j} b_{i-j}\, \frac{2}{n(n-1)},$$

and the expected proportion that are such that $R_i > R_j$ is 

$$c = \sum_{i>j} c_{i-j}\, \frac{2}{n(n-1)}.$$

If the expected association between the agreements of rankings Q with P and those of R with P 
were zero, then the expected proportion of these pairs that are such that $Q_i > Q_j$ and 
$R_i > R_j$ would be $bc$. Since $a$ is in general not equal to $bc$, the expected association between 
the agreements of rankings Q with P and those of R with P will differ from zero even when 
the partial association (in the usual sense) between Q and R, given $P_i$, is zero. Since 
$\tau_{QR.P}$ measures the intensity of association between the agreements of Q with P and those 
of R with P (see Kendall, 1955), justification of its use as a measure of partial association 
between Q and R, given P, is doubtful for situations of the kind considered herein. It is 
only in the special case where $\sum_{i>j} b_{i-j}\,[c_{i-j} - c] = 0$ that the expected association 
between agreements is zero when the partial association is zero. Thus, $\tau_{QR.P}$ does not 
measure the agreement between Q and R independently of the influence of P, except perhaps 
in situations where the ‘covariance’ between $b_{i-j}$ and $c_{i-j}$ is zero. The study of the partial 
association between Q and R, given P, is usually of interest in situations where the 
‘covariance’ between $b_{i-j}$ and $c_{i-j}$ is not zero. 
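
As a numerical illustration of the argument above, the following sketch computes $a$, $b$ and $c$ for a given $n$ from lag-dependent probabilities $b_d$ and $c_d$ with $d = i - j$. The logistic forms chosen for $b_d$ and $c_d$, and the function name `pooled_proportions`, are assumptions made only for this illustration and are not taken from the paper. The sums over $i > j$ are collapsed by noting that there are $n - d$ pairs with $i - j = d$.

```python
import math

def pooled_proportions(b, c, n):
    """Return (a, b_bar, c_bar) as defined in the text.

    b[d] and c[d] hold b_{i-j} and c_{i-j} for lag d = i - j (d = 1..n-1).
    """
    norm = 2.0 / (n * (n - 1))
    a = b_bar = c_bar = 0.0
    for d in range(1, n):
        weight = n - d  # number of pairs (i, j) with i - j = d
        a += weight * b[d] * c[d]
        b_bar += weight * b[d]
        c_bar += weight * c[d]
    return norm * a, norm * b_bar, norm * c_bar

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

n = 10
# Hypothetical lag-dependent probabilities (illustrative assumption only).
b_lag = {d: logistic(0.3 * d) for d in range(1, n)}
c_lag = {d: logistic(0.5 * d) for d in range(1, n)}
a, b_bar, c_bar = pooled_proportions(b_lag, c_lag, n)
print("a - bc with both sequences varying:", a - b_bar * c_bar)

# If b_d does not vary with the lag, the 'covariance' vanishes and a = bc.
flat = {d: 0.5 for d in range(1, n)}
a0, b0, c0 = pooled_proportions(flat, c_lag, n)
print("a - bc with constant b_d:", a0 - b0 * c0)
```

In this sketch both sequences increase with the lag, so the difference $a - bc$ is a positive weighted covariance even though condition (III) holds for every pair.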

Assuming conditions (I), (II) and (III) described in the preceding paragraph, we see that 
for the $(n-1)$ pairs $(Q_i, R_i, P_i)$, $(Q_j, R_j, P_j)$, where $i = j+1$, the expected proportion of these 
pairs that are such that $Q_i > Q_j$ and $R_i > R_j$ (given that $P_i > P_j$) is $a_1 = b_1 c_1$, while the 
expected proportion of these pairs that are such that $Q_i > Q_j$ is $b_1$, and the expected 
proportion of those that are such that $R_i > R_j$ is $c_1$. Since $a_1 = b_1 c_1$, the expected association 
between the agreements of rankings Q with P and those of R with P, when the $(n-1)$ pairs 
of observations where $i = j+1$ are compared, will be zero when the partial association (in 
the usual sense) between Q and R, given $P_i$, is zero. Thus, there is some justification for the 
use of a measure of the association $\tau_1$ between the agreements of rankings Q with P and those 
of R with P, as observed in the fourfold table giving the joint distribution of the agreements 
for the $(n-1)$ pairs of observations where $i = j+1$, as a measure of an aspect of the partial 
association between Q and R, given P, in the situation considered here. The measure of 
association $\tau_1$ based on the fourfold table for the $(n-1)$ pairs of observations where $i = j+1$ 
could be computed in an analogous fashion to the usual computation of partial τ based on the 
fourfold table for the $\tfrac12 n(n-1)$ pairs of observations where $i > j$ (see Kendall, 1948); it could 
be given a probabilistic and operational interpretation similar to that given to such measures 
of association for fourfold tables and also analogous to, but somewhat different from, the 
usual interpretation for τ (see Kendall (1955), Kruskal (1958), Goodman & Kruskal (1959)). 

By an approach similar to that presented above, we see that a measure of an aspect of the 
partial association between Q and R, given P, can be based on a measure of the association 
$\tau_2$ between the agreements of rankings Q with P and those of R with P, as observed in the 
fourfold table giving the joint distribution of the agreements for the $(n-2)$ pairs of 
observations where $i = j+2$. In general, the fourfold table giving the joint distribution of the 
agreements for the $(n-k)$ pairs of observations where $i = j+k$ (where $k$ is a fixed constant) 
can be used to obtain a measure of the association $\tau_k$ between agreements that will measure 
an aspect of the partial association between Q and R, given P. The symbol $k$ can denote any 
fixed constant between 1 and $n-1$, but when $k$ is close to $n-1$ the frequencies appearing in 
the fourfold table will be too small. For each $k$, a different fourfold table will be obtained, 
and thus a different measure of association $\tau_k$ will also be obtained. Each of these measures 
can be given a probabilistic and operational interpretation. 
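
The construction of the lag-$k$ fourfold table can be sketched in a few lines of Python. The function name `tau_k` and the example ranks are hypothetical; the measure computed is the fourfold-table quantity $(\hat a_k - \hat b_k \hat c_k)/\sqrt{\hat b_k \hat c_k (1-\hat b_k)(1-\hat c_k)}$ given in §3. The inputs are assumed to be listed in the natural order of P and to contain no ties.

```python
import math

def tau_k(q, r, k):
    """Fourfold table of agreements at lag k, and its measure of association.

    q, r: values (or ranks) of Q and R listed in the natural order of P.
    Returns (table, tau); table is keyed by (Q_i > Q_j, R_i > R_j) with i = j + k.
    """
    n = len(q)
    if len(r) != n or not 1 <= k < n:
        raise ValueError("need len(q) == len(r) and 1 <= k < n")
    table = {(True, True): 0, (True, False): 0,
             (False, True): 0, (False, False): 0}
    for j in range(n - k):
        i = j + k
        table[(q[i] > q[j], r[i] > r[j])] += 1
    m = n - k  # number of pairs compared
    a = table[(True, True)] / m
    b = (table[(True, True)] + table[(True, False)]) / m
    c = (table[(True, True)] + table[(False, True)]) / m
    denom = math.sqrt(b * c * (1 - b) * (1 - c))
    tau = (a - b * c) / denom if denom > 0 else float("nan")
    return table, tau

# Hypothetical ranks of Q and R, listed in the order P = 1, ..., 8.
q_ranks = [2, 1, 4, 3, 6, 5, 8, 7]
r_ranks = [1, 3, 2, 5, 4, 7, 6, 8]
table1, tau1 = tau_k(q_ranks, r_ranks, 1)
print(table1, tau1)
```

In this made-up example Q rises between adjacent observations exactly when R falls, so the lag-1 table has empty diagonal cells and the measure attains its lower bound.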

The fourfold table described by Kendall (1948), which is used to define partial τ, is the 
table obtained when the corresponding entries in the separate fourfold tables, computed 
for each fixed $k$, are added together for $k = 1, 2, \ldots, n-1$; i.e. this table gives the joint 
distribution of the agreements for the $\tfrac12 n(n-1)$ pairs of observations where $i = j+k$ for 
$k > 0$, and partial τ is a measure of the association between agreements in this table. Since 
the table obtained when the corresponding entries of a number of separate tables are added 
together need not indicate independence between the two qualities described by the table, 
even when there is complete independence in each of the separate fourfold tables, partial τ 
may differ from zero even when the measure of association $\tau_k$, computed for each of the 
separate tables (for $k = 1, 2, \ldots$), is zero. Some justification can be found for basing a measure 
of partial association between Q and R, given P, on a weighted average of the measures of 
association $\tau_k$ obtained from the separate fourfold tables ($k = 1, 2, \ldots$), but less justification 
can be found, in the situation considered here, for the use of partial τ, a measure of association 
obtained from a fourfold table, which is a weighted average, in a certain sense, of the 
separate tables (except in the special case where the ‘covariance’ between $b_{i-j}$ and $c_{i-j}$ is zero). 
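
The remark that adding independent tables can produce a dependent table is easy to check numerically. The two fourfold tables below are invented for illustration (they are not data from the paper, and they use equal totals for simplicity, whereas the tables for different $k$ contain $n-k$ pairs each); the helper name `phi` is likewise hypothetical.

```python
import math

def phi(table):
    """(a - b c) / sqrt(b c (1 - b)(1 - c)) for a fourfold table of counts.

    table = (both_agree, q_only_agrees, r_only_agrees, neither_agrees).
    """
    total = sum(table)
    a = table[0] / total
    b = (table[0] + table[1]) / total
    c = (table[0] + table[2]) / total
    return (a - b * c) / math.sqrt(b * c * (1 - b) * (1 - c))

# Two hypothetical fourfold tables, each showing exact independence.
table_low = (4, 16, 16, 64)    # b = c = 0.2 and a = 0.04 = b * c
table_high = (64, 16, 16, 4)   # b = c = 0.8 and a = 0.64 = b * c
pooled = tuple(x + y for x, y in zip(table_low, table_high))

print(phi(table_low), phi(table_high), phi(pooled))
```

Each separate table gives a measure of zero (up to rounding), while the pooled table does not, because the margins $b$ and $c$ move together from one table to the other.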

For the situation considered herein, the difference between the ranks $P_i$ and $P_j$ is meaningful. 
This meaningfulness leads in a natural way to the introduction of the separate fourfold 
tables and the corresponding measures of association $\tau_k$. Although no simple tests of 
significance are yet known for partial τ, it will be possible to develop such tests for the $\tau_k$ in the 
present situation (see §3). For situations where the difference between the ranks $P_i$ and $P_j$ is 
not meaningful, some justification can be given for the use of partial τ (see Kendall (1942) 
and Hoeffding (1948)); less justification can be found for the use of $\tau_k$ in these cases. 

For some further discussion of the relation between ordinal measures of association, such 
as τ or partial τ, and measures of association computed for cross-classification tables, such 
as the fourfold tables considered here, the reader is referred to Kendall (1948), Kruskal 
(1958), Goodman & Kruskal (1959). A discussion of measures of partial association 
computed for cross-classification tables will be found in Goodman & Kruskal (1954). 


3. TESTS OF SIGNIFICANCE 


Let us first consider the fourfold table giving the joint distribution of the agreements of 
rankings Q with P and those of R with P, for the $(n-1)$ pairs $(Q_i, R_i, P_i)$, $(Q_j, R_j, P_j)$, where 
$i = j+1$. 

It is easy to see that the usual $\chi^2$ test of independence for fourfold tables cannot be used 
in this case because the $(n-1)$ pairs of observations, where $i = j+1$ and $j = 1, 2, \ldots, n-1$, 
are not statistically independent. However, assuming conditions (I) and (II) described in 
the preceding section, the $\tfrac12(n-1)$ pairs where $i = j+1$ and $j = 1, 3, 5, \ldots, n-2$ (in the case 
where $n$ is odd) are statistically independent, given the $P_i$, and so are the $\tfrac12(n-1)$ pairs 
where $i = j+1$ and $j = 2, 4, 6, \ldots, n-1$. Thus, if condition (III) of the preceding section is 
also assumed, then the usual large sample theory for tests of independence for fourfold 
tables can be applied in order to prove that the asymptotic distribution of 


$$\tau_{11}\sqrt{n_1} = (\hat a_{11} - \hat b_{11}\hat c_{11}) \sqrt{\,n_1 / [\hat b_{11}\hat c_{11}(1-\hat b_{11})(1-\hat c_{11})]\,}$$

is normal with zero mean and unit variance, where $\hat a_{11}$ is the proportion of the $\tfrac12(n-1) = n_1$ 
pairs ($i = j+1$, $j = 1, 3, 5, \ldots, n-2$) that are such that $Q_i > Q_j$ and $R_i > R_j$, $\hat b_{11}$ is the 
proportion of these $n_1$ pairs that are such that $Q_i > Q_j$, and $\hat c_{11}$ is the proportion of these $n_1$ pairs 
that are such that $R_i > R_j$ (see, for example, Pearson & Hartley, 1954, p. 71). Similarly, 
the asymptotic distribution of 

$$\tau_{12}\sqrt{n_1} = (\hat a_{12} - \hat b_{12}\hat c_{12}) \sqrt{\,n_1 / [\hat b_{12}\hat c_{12}(1-\hat b_{12})(1-\hat c_{12})]\,}$$

is normal with zero mean and unit variance, where $\hat a_{12}$ is the proportion of the $n_1$ pairs 
($i = j+1$, $j = 2, 4, \ldots, n-1$) that are such that $Q_i > Q_j$ and $R_i > R_j$, $\hat b_{12}$ is the proportion 
of these $n_1$ pairs that are such that $Q_i > Q_j$, and $\hat c_{12}$ is the proportion of these $n_1$ pairs that are 
such that $R_i > R_j$. Thus, a test of the null hypothesis that condition (III) is true can be 
based on either $\tau_{11}\sqrt{n_1}$ or on $\tau_{12}\sqrt{n_1}$. Since these two statistics are correlated, they cannot 
be combined in the most direct fashion to form a single test of significance. However, it 
can be seen that the covariance between $(\hat a_{11} - \hat b_{11}\hat c_{11})\sqrt{n_1}$ and $(\hat a_{12} - \hat b_{12}\hat c_{12})\sqrt{n_1}$ is 
asymptotically $2 f_1 g_1$, where 

$$f_1 = d_1 - b_1^2, \qquad d_1 = \Pr\{Q_{i+1} > Q_i > Q_{i-1}\},$$
$$g_1 = e_1 - c_1^2 \quad\text{and}\quad e_1 = \Pr\{R_{i+1} > R_i > R_{i-1}\},$$

assuming that $d_1$ and $e_1$ are constant and do not depend on $i$ (see Goodman & Grunfeld, 1959). 
This result can be generalized to situations where $d_1$ and $e_1$ are not constant, but the 
generalization will not be presented here (see Hoeffding & Robbins, 1948, for some results related 
to this generalization). Using this result concerning the asymptotic covariance $2 f_1 g_1$, we 
find that the distribution of 

$$z_1 = (\hat a_1 - \hat b_1 \hat c_1) \sqrt{\,(n-1) / [\hat b_1 \hat c_1 (1-\hat b_1)(1-\hat c_1) + 2 \hat f_1 \hat g_1]\,}$$

is normal with zero mean and unit variance, where $\hat a_1 = \tfrac12(\hat a_{11} + \hat a_{12})$ is the proportion of the 
$(n-1)$ pairs ($i = j+1$, $j = 1, 2, 3, \ldots, n-1$) that are such that $Q_i > Q_j$ and $R_i > R_j$, 

$$\hat b_1 = \tfrac12(\hat b_{11} + \hat b_{12}), \quad \hat c_1 = \tfrac12(\hat c_{11} + \hat c_{12}), \quad \hat f_1 = \hat d_1 - \hat b_1^2, \quad \hat g_1 = \hat e_1 - \hat c_1^2,$$

$\hat d_1$ is the proportion of the $(n-2)$ triplets $\{Q_{i+1}, Q_i, Q_{i-1}\}$ that are such that 
$Q_{i+1} > Q_i > Q_{i-1}$, and $\hat e_1$ is the proportion of the $(n-2)$ triplets $\{R_{i+1}, R_i, R_{i-1}\}$ that are such that 
$R_{i+1} > R_i > R_{i-1}$ 
(see Goodman & Grunfeld, 1959, for a detailed proof and also for some applications of this 
result). Since the measure of association $\tau_1$ between the agreements of rankings Q with P 
and those of R with P, as observed in the fourfold table giving the joint distribution of the 
agreements for the $(n-1)$ pairs of observations where $i = j+1$, is simply 

$$\tau_1 = (\hat a_1 - \hat b_1 \hat c_1) / \sqrt{\hat b_1 \hat c_1 (1-\hat b_1)(1-\hat c_1)},$$

we see that the asymptotic distribution of $z_1 = \tau_1 \sqrt{(n-1)/h_1}$ is normal with zero mean 
and unit variance, where 

$$h_1 = 1 + [\,2 \hat f_1 \hat g_1 / \hat b_1 \hat c_1 (1-\hat b_1)(1-\hat c_1)\,].$$
Thus, a test of the null hypothesis that condition (III) is true can be based on $\tau_1$ or, more 
precisely, on $\tau_1 \sqrt{(n-1)/h_1}$. For the situation considered here, this test of significance 
based on $\tau_1$ provides a partial test of the hypothesis that the partial association between 
Q and R, given $P_i$, is zero. 

Let us now consider the fourfold table giving the joint distribution of the agreements of 
rankings Q with P and those of R with P, for the $(n-2)$ pairs $(Q_i, R_i, P_i)$, $(Q_j, R_j, P_j)$, where $i = j+2$. 
The usual $\chi^2$ test of independence for fourfold tables cannot be applied in this case. However, 
assuming conditions (I) and (II), the $\tfrac12(n-2) = n_2$ pairs ($i = j+2$, $j = 1, 2, 5, 6, 9, 10, 13, 14, \ldots$, 
in the case where $n_2$ is even) are statistically independent, given the $P_i$, and so are the 
$\tfrac12(n-2)$ pairs ($i = j+2$, $j = 3, 4, 7, 8, 11, 12, \ldots$). Thus, if condition (III) is also assumed, 
then the large sample theory for tests of independence for fourfold tables can be applied in 
order to prove that the asymptotic distribution of 

$$\tau_{21}\sqrt{n_2} = (\hat a_{21} - \hat b_{21}\hat c_{21}) \sqrt{\,n_2 / [\hat b_{21}\hat c_{21}(1-\hat b_{21})(1-\hat c_{21})]\,}$$

is normal with zero mean and unit variance, where $\hat a_{21}$ is the proportion of the $n_2$ pairs 
($i = j+2$, $j = 1, 2, 5, 6, 9, 10, \ldots$) that are such that $Q_i > Q_j$ and $R_i > R_j$, $\hat b_{21}$ is the proportion 
of these $n_2$ pairs that are such that $Q_i > Q_j$, and $\hat c_{21}$ is the proportion of these $n_2$ pairs that are 
such that $R_i > R_j$. Similarly, the asymptotic distribution of 

$$\tau_{22}\sqrt{n_2} = (\hat a_{22} - \hat b_{22}\hat c_{22}) \sqrt{\,n_2 / [\hat b_{22}\hat c_{22}(1-\hat b_{22})(1-\hat c_{22})]\,}$$

is normal with zero mean and unit variance, where $\hat a_{22}$, $\hat b_{22}$ and $\hat c_{22}$ are defined in an analogous 
way for the $n_2$ pairs ($i = j+2$, $j = 3, 4, 7, 8, 11, 12, \ldots$). A test of the null hypothesis that condition 
(III) is true can be based on either $\tau_{21}\sqrt{n_2}$ or on $\tau_{22}\sqrt{n_2}$. It can be seen that the covariance 
between $(\hat a_{21} - \hat b_{21}\hat c_{21})\sqrt{n_2}$ and $(\hat a_{22} - \hat b_{22}\hat c_{22})\sqrt{n_2}$ is asymptotically $2 f_2 g_2$, where 

$$f_2 = d_2 - b_2^2, \qquad d_2 = \Pr\{Q_{i+2} > Q_i > Q_{i-2}\},$$
$$g_2 = e_2 - c_2^2 \quad\text{and}\quad e_2 = \Pr\{R_{i+2} > R_i > R_{i-2}\},$$

assuming that $d_2$ and $e_2$ are not functions of $i$. (This result can be generalized somewhat, 
but this generalization will not be presented here.) Also, the distribution of 

$$z_2 = (\hat a_2 - \hat b_2 \hat c_2) \sqrt{\,(n-2) / [\hat b_2 \hat c_2 (1-\hat b_2)(1-\hat c_2) + 2 \hat f_2 \hat g_2]\,}$$

can be seen to be normal with zero mean and unit variance, where $\hat a_2 = \tfrac12(\hat a_{21} + \hat a_{22})$ is the 
proportion of the $n-2$ pairs ($i = j+2$, $j = 1, 2, \ldots, n-2$) that are such that $Q_i > Q_j$ and 
$R_i > R_j$, 

$$\hat b_2 = \tfrac12(\hat b_{21} + \hat b_{22}), \quad \hat c_2 = \tfrac12(\hat c_{21} + \hat c_{22}), \quad \hat f_2 = \hat d_2 - \hat b_2^2, \quad \hat g_2 = \hat e_2 - \hat c_2^2,$$

$\hat d_2$ is the proportion of the $(n-4)$ triplets $\{Q_{i+2}, Q_i, Q_{i-2}\}$ that are such that $Q_{i+2} > Q_i > Q_{i-2}$, 
and $\hat e_2$ is the proportion of the $(n-4)$ triplets that are such that $R_{i+2} > R_i > R_{i-2}$. Since the 
measure of association $\tau_2$ between the agreements of rankings Q with P and those of R 
with P, as observed in the fourfold table giving the joint distribution of the agreements 
for the $(n-2)$ pairs of observations where $i = j+2$, is simply 

$$\tau_2 = (\hat a_2 - \hat b_2 \hat c_2) / \sqrt{\hat b_2 \hat c_2 (1-\hat b_2)(1-\hat c_2)},$$

we see that the asymptotic distribution of $z_2 = \tau_2 \sqrt{(n-2)/h_2}$ is normal with zero mean 
and unit variance, where 

$$h_2 = 1 + [\,2 \hat f_2 \hat g_2 / \hat b_2 \hat c_2 (1-\hat b_2)(1-\hat c_2)\,].$$


Thus, a test of the null hypothesis that condition (III) is true can be based on $\tau_2$ (i.e. on 
$\tau_2 \sqrt{(n-2)/h_2}$), which provides a partial test of significance of the partial association 
between Q and R, given $P_i$. 

Let us now consider the general case where the fourfold table giving the joint distribution 
of the agreements for the $(n-k)$ pairs of observations $(Q_i, R_i, P_i)$, $(Q_j, R_j, P_j)$, where $i = j+k$, 
is of interest, where $k$ is a fixed constant. The special cases where $k = 1$ and where $k = 2$ 
were discussed above. The measure of association $\tau_k$ between the agreements of rankings 
Q with P and those of R with P, as observed in this fourfold table for the $(n-k)$ pairs of 
observations, is simply 

$$\tau_k = (\hat a_k - \hat b_k \hat c_k) / \sqrt{\hat b_k \hat c_k (1-\hat b_k)(1-\hat c_k)},$$

where $\hat a_k$ is the proportion of the $(n-k)$ pairs that are such that $Q_i > Q_j$ and $R_i > R_j$, 
$\hat b_k$ is the proportion of these $(n-k)$ pairs that are such that $Q_i > Q_j$, and $\hat c_k$ is the proportion 
of the $(n-k)$ pairs that are such that $R_i > R_j$. We shall assume that the following condition 
holds true: 

(IIa) $d_k = \Pr\{Q_{i+k} > Q_i > Q_{i-k}\}$ and $e_k = \Pr\{R_{i+k} > R_i > R_{i-k}\}$ are not functions of $i$ 
(this condition can be weakened in order to obtain somewhat more general results (see 
related results by Hoeffding & Robbins, 1948)). 

By an argument similar to that presented earlier herein, it can be seen that, under 
conditions (I), (II), (IIa) and (III), the asymptotic distribution ($n \to \infty$) of $\tau_k \sqrt{(n-k)/h_k}$ is 
normal with zero mean and unit variance, where 

$$h_k = 1 + [\,2 \hat f_k \hat g_k / \hat b_k \hat c_k (1-\hat b_k)(1-\hat c_k)\,], \qquad \hat f_k = \hat d_k - \hat b_k^2, \qquad \hat g_k = \hat e_k - \hat c_k^2,$$

$\hat d_k$ is the proportion of the $(n-2k)$ triplets $\{Q_{i+k}, Q_i, Q_{i-k}\}$ that are such that $Q_{i+k} > Q_i > Q_{i-k}$, 
and $\hat e_k$ is the proportion of the $(n-2k)$ triplets $\{R_{i+k}, R_i, R_{i-k}\}$ that are such that 
$R_{i+k} > R_i > R_{i-k}$. 

The quantity $\hat f_k$ is an estimate of the ‘serial covariance’ of order $k$ in the sequence of random 
variables $S(Q_{k+1} - Q_1)$, $S(Q_{k+2} - Q_2)$, $S(Q_{k+3} - Q_3)$, ..., and $\hat g_k$ is an estimate of the ‘serial 
covariance’ of order $k$ in the sequence $S(R_{k+1} - R_1)$, $S(R_{k+2} - R_2)$, $S(R_{k+3} - R_3)$, ..., where 
$S(X) = 1$ when $X > 0$ and $S(X) = 0$ when $X < 0$. A test of the null hypothesis that 
condition (III) is true can be based on $\tau_k$ (i.e. on $\tau_k \sqrt{(n-k)/h_k}$), which provides a partial 
test of significance of the partial association between Q and R, given $P_i$. 
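
The general-$k$ statistic can be sketched as follows. The function name `lag_k_test`, the simulated series, and the two-sided normal p-value are assumptions added for illustration; the quantities $\hat a_k$, $\hat b_k$, $\hat c_k$, $\hat d_k$, $\hat e_k$, $\hat f_k$, $\hat g_k$ and $h_k$ follow the definitions above. The inputs are assumed to be listed in the natural order of P, with no ties and $n > 2k$.

```python
import math
import random

def lag_k_test(q, r, k=1):
    """Return (tau_k, h_k, z_k, two_sided_p) for series q, r ordered by P."""
    n = len(q)
    if len(r) != n or k < 1 or n <= 2 * k:
        raise ValueError("need len(q) == len(r), k >= 1 and n > 2k")
    m = n - k  # number of pairs (i, j) with i = j + k
    u = [int(q[j + k] > q[j]) for j in range(m)]  # S(Q_{j+k} - Q_j)
    v = [int(r[j + k] > r[j]) for j in range(m)]  # S(R_{j+k} - R_j)
    a = sum(x * y for x, y in zip(u, v)) / m
    b = sum(u) / m
    c = sum(v) / m
    base = b * c * (1 - b) * (1 - c)
    if base == 0:
        raise ValueError("a margin of the fourfold table is empty")
    triplets = n - 2 * k
    # u[j] * u[j + k] == 1 exactly when Q_{j+2k} > Q_{j+k} > Q_j.
    d = sum(u[j] * u[j + k] for j in range(triplets)) / triplets
    e = sum(v[j] * v[j + k] for j in range(triplets)) / triplets
    f = d - b * b
    g = e - c * c
    tau = (a - b * c) / math.sqrt(base)
    h = 1 + 2 * f * g / base
    if h <= 0:
        raise ValueError("estimated variance factor h_k is not positive")
    z = tau * math.sqrt(m / h)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return tau, h, z, p

# Simulated example: Q and R each depend linearly on P but are otherwise independent.
rng = random.Random(0)
n = 200
q_series = [0.05 * i + rng.gauss(0, 1) for i in range(n)]
r_series = [0.05 * i + rng.gauss(0, 1) for i in range(n)]
for k in (1, 2, 3):
    print(k, lag_k_test(q_series, r_series, k))
```

Because the large-sample theory requires reasonably filled cells, a sketch of this kind is only meaningful when $n - k$ is not small.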

The tests here described have been applied, in the special case where $k = 1$, by Goodman 
& Grunfeld (1959). The reader is referred to this article for a more detailed discussion of the 
special case, and also for some numerical examples illustrating the application of the tests 
based on $\tau_1$ to the analysis of co-movements in time series. The tests based on $\tau_k$ (for 
$k = 1, 2, \ldots$) described herein could also be applied to the analysis of co-movements in time 
series in a manner similar to the application of $\tau_1$. 


MOMENTS OF ORDER STATISTICS FROM 
A NORMAL POPULATION* 


By R. C. BOSE† AND SHANTI S. GUPTA‡ 
University of North Carolina 


1. INTRODUCTION AND SUMMARY 


Statistics based on ordered observations have been called systematic statistics by Mosteller 
(1946). They are now being increasingly used in new statistical procedures (Bahadur, 1950; 
Bechhofer, 1954; Bechhofer, Dunnett & Sobel, 1954; Bechhofer & Sobel, 1956; Godwin, 
1949; Gupta, 1956; Mosteller, 1948; Nair, 1948; Paulson, 1949, 1952; Seal, 1955). The present 
paper deals with the problem of obtaining the moments of $X_{(k)}$, the $k$th order statistic for 
a sample of size $n$ from a normal population N(0, 1). This problem has been considered 
among others by Cole (1951), Godwin (1949), Hastings, Mosteller, Tukey & Winsor (1947), 
Hojo (1931), Jones (1948), Paulson (1952), Ruben (1954), Teichroew (1956) and Tippett 
(1925). 

It has been shown that $\mu'_t(n, k)$, the $t$th moment of $X_{(k)}$, can be expressed in terms of lower 
moments of order $t - 2i$ ($i = 1, 2, \ldots, \tfrac12 t$ or $\tfrac12(t-1)$) and the integral 

$$\int_{-\infty}^{+\infty} P_{t+2}(x)\, e^{-\frac12 (t+1) x^2}\, dx, \tag{1·1}$$

where $P_{t+2}(x)$, for $t \ge 0$, is defined by 

$$P_{t+2}(x) = k \binom{n}{k} \frac{d^t}{d\Phi^t}\, \Phi^{k-1} (1 - \Phi)^{n-k}, \tag{1·2}$$

it being understood that in (1·2), $\Phi$ is replaced after differentiation by $\Phi(x)$, the cumulative 
distribution function (c.d.f.) of N(0, 1). $P_{t+1}(x)$ is thus a polynomial of degree $(n-t)$ in $\Phi(x)$ 
if $t \le n$ and is zero if $t > n$. Exact values of all odd order moments can be derived when 
$n \le 5$, and the exact values of all even order moments can be derived when $n \le 6$. Godwin 
(1949) and Jones (1948) have given tables of exact moments $\mu'_t(n, k)$ for $t = 1$ and 2. The 
corresponding tables for $t = 3$ and 4 are provided in this paper. In general the numerical 
evaluation of the integral (1·1) can be expeditiously done by using the Gauss method of 
numerical integration (Szegö, 1939) based on the zeros and the weight factors of the Hermite 
polynomials, for which tables have been provided by Salzer, Zucker & Capuano (1952). 
A sufficiently large number of decimal places must, however, be retained in the computation, 
to secure an adequate number of significant figures in the final moments about the mean. 


2. THE FUNCTION $P_t(n, k, x)$ 


Let $x_1, x_2, \ldots, x_n$ be $n$ independent observations from a normal population N(0, 1) with zero 
mean and unit variance, and let 

$$x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)} \tag{2·1}$$

be the $n$ ranked observations among $x_1, x_2, \ldots, x_n$. Then the cumulative distribution function 
of $X_{(k)}$, the random variable corresponding to $x_{(k)}$ $(1 \le k \le n)$, is given by 

$$P_1(n, k, x) = \operatorname{Prob}\{X_{(k)} \le x\} = \frac{C}{(2\pi)^{1/2}} \int_{-\infty}^{x} [\Phi(x)]^{k-1} [1 - \Phi(x)]^{n-k} e^{-\frac12 x^2}\, dx, \tag{2·2}$$

where $\Phi(x)$ is defined as 

$$\Phi(x) = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{x} e^{-\frac12 x^2}\, dx, \tag{2·3}$$

and $C$ is the constant 

$$C = \frac{n!}{(k-1)!\,(n-k)!}. \tag{2·4}$$

Let us now define the function $P_t(n, k, x)$, which we shall abbreviate to $P_t(x)$ for 
convenience, by the relation 

$$P_{t+1}(x) = \frac{dP_t}{d\Phi}, \tag{2·5}$$

where $P_1(x)$ is given by (2·2). Then 

$$P_2(x) = C\,[\Phi(x)]^{k-1} [1 - \Phi(x)]^{n-k}. \tag{2·6}$$

It is clear that $P_{t+1}(x)$ is a polynomial of degree $n-t$ in $\Phi(x)$ if $1 \le t \le n$, and is zero for $t > n$. 
In fact, we can write 

$$P_{t+2}(x) = C\, \frac{d^t}{d\Phi^t} \left[\Phi^{k-1} (1 - \Phi)^{n-k}\right], \tag{2·7}$$

where $\Phi$ is replaced by $\Phi(x)$ after the differentiation. It follows that for given $t$, $n$, $k$, $P_t(x)$ 
is a bounded function of $x$. The functions $P_3(x)$, $P_4(x)$, $P_5(x)$ and $P_6(x)$ are given below 
explicitly, where $\Phi$ is written for $\Phi(x)$. 

$$P_3(x) = C\,\Phi^{k-2}(1-\Phi)^{n-k-1}\,[(k-1) - (n-1)\Phi], \tag{2·8}$$

$$P_4(x) = C\,\Phi^{k-3}(1-\Phi)^{n-k-2}\,[(k-1)(k-2) - 2(k-1)(n-2)\Phi + (n-1)(n-2)\Phi^2], \tag{2·9}$$

$$P_5(x) = C\,\Phi^{k-4}(1-\Phi)^{n-k-3}\,[(k-1)(k-2)(k-3) - 3(k-1)(k-2)(n-3)\Phi + 3(k-1)(n-2)(n-3)\Phi^2 - (n-1)(n-2)(n-3)\Phi^3], \tag{2·10}$$

$$P_6(x) = C\,\Phi^{k-5}(1-\Phi)^{n-k-4}\,[(k-1)(k-2)(k-3)(k-4) - 4(k-1)(k-2)(k-3)(n-4)\Phi + 6(k-1)(k-2)(n-3)(n-4)\Phi^2 - 4(k-1)(n-2)(n-3)(n-4)\Phi^3 + (n-1)(n-2)(n-3)(n-4)\Phi^4]. \tag{2·11}$$

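3. A SYSTEM OF DIFFERENTIAL EQUATIONS SATISFIED BY $P_1(x)$ 


From (2·5) 

$$P_2(x) = (2\pi)^{1/2} e^{\frac12 x^2}\, \frac{dP_1}{dx}, \tag{3·1}$$

$$P_3(x) = (2\pi)^{1/2} e^{\frac12 x^2}\, \frac{d}{dx}\left[(2\pi)^{1/2} e^{\frac12 x^2}\, \frac{dP_1}{dx}\right] = (2\pi)\, e^{x^2} \left[\frac{d^2P_1}{dx^2} + x \frac{dP_1}{dx}\right]. \tag{3·2}$$

In general let us assume 

$$P_{t+1}(x) = (2\pi)^{t/2} e^{\frac12 t x^2} \sum_{r=0}^{t-1} g_{r,t}(x)\, \frac{d^{t-r}P_1}{dx^{t-r}}, \tag{3·3}$$

where 

$$g_{0,t}(x) = 1, \tag{3·4}$$

and $g_{r,t}(x)$ is a polynomial in $x$ of the $r$th degree. Differentiating (3·3) and using (2·5), we have 

$$P_{t+2}(x) = (2\pi)^{(t+1)/2} e^{\frac12 (t+1) x^2} \sum_{r=0}^{t-1} \left[ g_{r,t}(x)\, \frac{d^{t-r+1}P_1}{dx^{t-r+1}} + \left( t x\, g_{r,t}(x) + \frac{d g_{r,t}(x)}{dx} \right) \frac{d^{t-r}P_1}{dx^{t-r}} \right]. \tag{3·5}$$

This leads to the recurrence relation 

$$g_{r,t+1}(x) = g_{r,t}(x) + t x\, g_{r-1,t}(x) + \frac{d}{dx}\, g_{r-1,t}(x), \tag{3·6}$$

where $g_{t,t}(x)$ should be interpreted as zero. This together with (3·4) determines all the 
polynomials $g_{r,t}(x)$. Starting from 

$$g_{0,1}(x) = 1 \tag{3·7}$$

we can successively calculate 

$$g_{0,2}(x) = 1, \quad g_{1,2}(x) = x, \tag{3·8}$$

$$g_{0,3}(x) = 1, \quad g_{1,3}(x) = 3x, \quad g_{2,3}(x) = 2x^2 + 1, \tag{3·9}$$

$$g_{0,4}(x) = 1, \quad g_{1,4}(x) = 6x, \quad g_{2,4}(x) = 11x^2 + 4, \quad g_{3,4}(x) = 6x^3 + 7x, \tag{3·10}$$

$$g_{0,5}(x) = 1, \quad g_{1,5}(x) = 10x, \quad g_{2,5}(x) = 35x^2 + 10, \quad g_{3,5}(x) = 50x^3 + 45x, \quad g_{4,5}(x) = 24x^4 + 46x^2 + 7. \tag{3·11}$$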
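
The recurrence (3·6) is mechanical enough to check by machine. The following sketch, which is an illustration and not part of the original paper, represents each polynomial $g_{r,t}(x)$ as a list of coefficients in increasing powers of $x$ and builds the rows $t = 1, 2, \ldots$ from $g_{0,1}(x) = 1$; the helper names are hypothetical.

```python
def add(p, q):
    """Sum of two polynomials given as coefficient lists (lowest degree first)."""
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(size)]

def times_x(p, factor):
    """factor * x * p(x)."""
    return [0] + [factor * coeff for coeff in p]

def derivative(p):
    return [i * p[i] for i in range(1, len(p))]

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def g_polynomials(t_max):
    """Return {t: [g_{0,t}, ..., g_{t-1,t}]} for t = 1..t_max via (3.6)."""
    g = {1: [[1]]}
    for t in range(1, t_max):
        row = []
        for r in range(t + 1):
            same = g[t][r] if r < t else [0]        # g_{t,t} is taken as zero
            prev = g[t][r - 1] if r >= 1 else [0]   # no term for r = 0
            row.append(trim(add(same, add(times_x(prev, t), derivative(prev)))))
        g[t + 1] = row
    return g

g = g_polynomials(5)
for t in sorted(g):
    print(t, g[t])
```

With this representation the list `[7, 0, 46, 0, 24]` stands for $24x^4 + 46x^2 + 7$, so the printed rows can be compared directly with (3·8) to (3·11).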

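Hence we have the set of equations 

$$\frac{dP_1}{dx} = \frac{1}{(2\pi)^{1/2}}\, e^{-\frac12 x^2} P_2(x), \tag{3·12}$$

$$\frac{d^2P_1}{dx^2} + x \frac{dP_1}{dx} = \frac{1}{2\pi}\, e^{-x^2} P_3(x), \tag{3·13}$$

$$\frac{d^3P_1}{dx^3} + 3x \frac{d^2P_1}{dx^2} + (2x^2 + 1) \frac{dP_1}{dx} = \frac{1}{(2\pi)^{3/2}}\, e^{-\frac32 x^2} P_4(x), \tag{3·14}$$

$$\frac{d^4P_1}{dx^4} + 6x \frac{d^3P_1}{dx^3} + (11x^2 + 4) \frac{d^2P_1}{dx^2} + (6x^3 + 7x) \frac{dP_1}{dx} = \frac{1}{(2\pi)^{2}}\, e^{-2x^2} P_5(x), \tag{3·15}$$

$$\frac{d^5P_1}{dx^5} + 10x \frac{d^4P_1}{dx^4} + (35x^2 + 10) \frac{d^3P_1}{dx^3} + (50x^3 + 45x) \frac{d^2P_1}{dx^2} + (24x^4 + 46x^2 + 7) \frac{dP_1}{dx} = \frac{1}{(2\pi)^{5/2}}\, e^{-\frac52 x^2} P_6(x). \tag{3·16}$$

We can proceed in this manner up to any order, but it should be noted that $P_{n+1}(x)$ is a 
constant and $P_{t+1}(x) = 0$ if $t > n$. The general equation is 

$$g_{0,t}(x) \frac{d^tP_1}{dx^t} + g_{1,t}(x) \frac{d^{t-1}P_1}{dx^{t-1}} + \cdots + g_{t-1,t}(x) \frac{dP_1}{dx} = \frac{1}{(2\pi)^{t/2}}\, e^{-\frac12 t x^2} P_{t+1}(x). \tag{3·17}$$


4. MOMENTS OF $X_{(k)}$ 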


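We shall first prove the following Lemma: 

Lemma. If $\alpha$ and $r$ are non-negative integers, then 

$$\int_{-\infty}^{+\infty} x^{\alpha}\, \frac{d^{r+1}P_1}{dx^{r+1}}\, dx = (-1)^r \alpha(\alpha-1)\cdots(\alpha-r+1)\, \mu'_{\alpha-r} \quad\text{or}\quad 0 \tag{4·1}$$

according as $r \le \alpha$ or $r > \alpha$, where $\mu'_{\alpha-r}$ is the $(\alpha-r)$th order moment of $X_{(k)}$ about the origin. 
It should be noted that by definition 

$$\int_{-\infty}^{+\infty} x^{\alpha}\, \frac{dP_1}{dx}\, dx = \mu'_{\alpha}. \tag{4·2}$$

From (3·12) and (3·13) 

$$\frac{dP_1}{dx} = \frac{1}{(2\pi)^{1/2}}\, e^{-\frac12 x^2} P_2(x), \tag{4·3}$$

$$\frac{d^2P_1}{dx^2} = -\frac{x}{(2\pi)^{1/2}}\, e^{-\frac12 x^2} P_2(x) + \frac{1}{2\pi}\, e^{-x^2} P_3(x), \tag{4·4}$$

and in general using the system of equations (3·12)–(3·17) we can write 

$$\frac{d^tP_1}{dx^t} = \sum_{j=1}^{t} h_j(x)\, e^{-\frac12 j x^2} P_{j+1}(x), \tag{4·5}$$

where $h_j(x)$ is a polynomial in $x$. Now 

$$\int_{-\infty}^{+\infty} x^{\alpha}\, \frac{d^{r+1}P_1}{dx^{r+1}}\, dx = \left[ x^{\alpha}\, \frac{d^rP_1}{dx^r} \right]_{-\infty}^{+\infty} - \alpha \int_{-\infty}^{+\infty} x^{\alpha-1}\, \frac{d^rP_1}{dx^r}\, dx. \tag{4·6}$$

Since $P_t(x)$ for any non-negative integer $t$ is a bounded function of $x$, it follows from (4·5) 
that the first part on the right-hand side of (4·6) vanishes. Repeating this process we get, 
if $r \le \alpha$, 

$$\int_{-\infty}^{+\infty} x^{\alpha}\, \frac{d^{r+1}P_1}{dx^{r+1}}\, dx = (-1)^r \alpha(\alpha-1)\cdots(\alpha-r+1) \int_{-\infty}^{+\infty} x^{\alpha-r}\, \frac{dP_1}{dx}\, dx = (-1)^r \alpha(\alpha-1)\cdots(\alpha-r+1)\, \mu'_{\alpha-r}.$$

If $r > \alpha$, we get, on repeating the process $\alpha$ times, 

$$\int_{-\infty}^{+\infty} x^{\alpha}\, \frac{d^{r+1}P_1}{dx^{r+1}}\, dx = (-1)^{\alpha} \alpha(\alpha-1)\cdots 3\cdot 2\cdot 1 \int_{-\infty}^{+\infty} \frac{d^{r-\alpha+1}P_1}{dx^{r-\alpha+1}}\, dx = (-1)^{\alpha} \alpha(\alpha-1)\cdots 3\cdot 2\cdot 1 \left[ \frac{d^{r-\alpha}P_1}{dx^{r-\alpha}} \right]_{-\infty}^{+\infty} = 0.$$

This proves the Lemma. 

On applying the Lemma and integrating the equations (3·13)–(3·16) we get 

$$\mu'_1 = \frac{1}{2\pi} \int_{-\infty}^{+\infty} P_3(x)\, e^{-x^2}\, dx, \tag{4·7}$$

$$-3 + (2\mu'_2 + 1) = \frac{1}{(2\pi)^{3/2}} \int_{-\infty}^{+\infty} P_4(x)\, e^{-\frac32 x^2}\, dx, \tag{4·8}$$

$$-22\mu'_1 + (6\mu'_3 + 7\mu'_1) = \frac{1}{(2\pi)^{2}} \int_{-\infty}^{+\infty} P_5(x)\, e^{-2x^2}\, dx, \tag{4·9}$$

$$70 - 150\mu'_2 - 45 + (24\mu'_4 + 46\mu'_2 + 7) = \frac{1}{(2\pi)^{5/2}} \int_{-\infty}^{+\infty} P_6(x)\, e^{-\frac52 x^2}\, dx. \tag{4·10}$$

We may write $\mu'_{\alpha}(n, k)$ instead of $\mu'_{\alpha}$ to denote the fact that we have the $\alpha$th moment about 
the origin of the $k$th order statistic out of a sample of $n$ observations from N(0, 1). We then 
have 

$$\mu'_1(n, k) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} P_3(x)\, e^{-x^2}\, dx, \tag{4·11}$$

$$\mu'_2(n, k) = 1 + \frac{1}{2(2\pi)^{3/2}} \int_{-\infty}^{+\infty} P_4(x)\, e^{-\frac32 x^2}\, dx, \tag{4·12}$$

$$\mu'_3(n, k) = \tfrac52\, \mu'_1(n, k) + \frac{1}{6(2\pi)^{2}} \int_{-\infty}^{+\infty} P_5(x)\, e^{-2x^2}\, dx, \tag{4·13}$$

$$\mu'_4(n, k) = -\tfrac43 + \tfrac{13}{3}\, \mu'_2(n, k) + \frac{1}{24(2\pi)^{5/2}} \int_{-\infty}^{+\infty} P_6(x)\, e^{-\frac52 x^2}\, dx. \tag{4·14}$$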
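
As a check on (4·11) and (4·12), the following sketch compares them with moments obtained by integrating $x^t$ directly against the density of $X_{(k)}$. It is an illustration under stated assumptions rather than the authors' computation: the integrals are approximated by composite Simpson's rule on $[-10, 10]$ instead of the Gauss and Hermite method mentioned in §1, the functions $P_t(x)$ are evaluated from (2·7) by Leibniz's rule, and all function names are hypothetical.

```python
import math

def phi_cdf(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def falling(p, j):
    """p (p - 1) ... (p - j + 1); zero once j exceeds a non-negative integer p."""
    out = 1
    for i in range(j):
        out *= p - i
    return out

def P(index, n, k, x):
    """P_index(x), index >= 2, from (2.7): C d^t/dPhi^t [Phi^(k-1) (1-Phi)^(n-k)]."""
    t = index - 2
    p, q = k - 1, n - k
    C = math.factorial(n) // (math.factorial(k - 1) * math.factorial(n - k))
    F = phi_cdf(x)
    total = 0.0
    for j in range(t + 1):
        cp, cq = falling(p, j), falling(q, t - j)
        if cp == 0 or cq == 0:
            continue  # term vanishes; also avoids negative exponents
        total += (math.comb(t, j) * cp * (-1) ** (t - j) * cq
                  * F ** (p - j) * (1.0 - F) ** (q - t + j))
    return C * total

def simpson(f, lo=-10.0, hi=10.0, steps=4000):
    """Composite Simpson's rule (steps must be even)."""
    h = (hi - lo) / steps
    s = f(lo) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

def raw_moment_direct(t, n, k):
    # The density of X_(k) is P_2(x) times the N(0, 1) density.
    integral = simpson(lambda x: x ** t * P(2, n, k, x) * math.exp(-0.5 * x * x))
    return integral / math.sqrt(2.0 * math.pi)

def mu1_formula(n, k):  # equation (4.11)
    return simpson(lambda x: P(3, n, k, x) * math.exp(-x * x)) / (2.0 * math.pi)

def mu2_formula(n, k):  # equation (4.12)
    integral = simpson(lambda x: P(4, n, k, x) * math.exp(-1.5 * x * x))
    return 1.0 + integral / (2.0 * (2.0 * math.pi) ** 1.5)

n, k = 5, 4
print(mu1_formula(n, k), raw_moment_direct(1, n, k))
print(mu2_formula(n, k), raw_moment_direct(2, n, k))
```

The two columns printed for each moment should agree to within the quadrature error, which gives some reassurance that the signs and constants in (4·11) and (4·12) are consistent with (2·7).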


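In general applying the Lemma to (3·17) we can express $\mu'_t(n, k)$ in terms of lower moments 
of even (odd) order when $t$ is even (odd) and the integral 

$$\int_{-\infty}^{+\infty} P_{t+2}(x)\, e^{-\frac12 (t+1) x^2}\, dx, \tag{4·15}$$

where the polynomials $P_3(x), \ldots, P_6(x)$ are given by (2·8)–(2·11). In the particular case when 
$n = k$, i.e., $x_{(n)}$ is the largest of $x_1, x_2, \ldots, x_n$, $P_{t+1}(x)$ assumes the very simple form 

$$P_{t+1}(x) = n(n-1)(n-2)\cdots(n-t+1)\, [\Phi(x)]^{n-t}. \tag{4·16}$$

Hence we get 

$$\mu'_1(n, n) = \frac{1}{\pi} \binom{n}{2} \int_{-\infty}^{+\infty} [\Phi(x)]^{n-2}\, e^{-x^2}\, dx, \tag{4·17}$$

$$\mu'_2(n, n) = 1 + \frac{3}{(2\pi)^{3/2}} \binom{n}{3} \int_{-\infty}^{+\infty} [\Phi(x)]^{n-3}\, e^{-\frac32 x^2}\, dx, \tag{4·18}$$

$$\mu'_3(n, n) = \tfrac52\, \mu'_1(n, n) + \frac{1}{\pi^2} \binom{n}{4} \int_{-\infty}^{+\infty} [\Phi(x)]^{n-4}\, e^{-2x^2}\, dx, \tag{4·19}$$

$$\mu'_4(n, n) = -\tfrac43 + \tfrac{13}{3}\, \mu'_2(n, n) + \frac{5}{(2\pi)^{5/2}} \binom{n}{5} \int_{-\infty}^{+\infty} [\Phi(x)]^{n-5}\, e^{-\frac52 x^2}\, dx. \tag{4·20}$$

It should be noted that in the formulae (4·17) to (4·20) $[\Phi(x)]^{n-t}$ should be interpreted as 
zero if $t > n$. 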

Some integrals of the type occurring in this paper have been numerically evaluated by 
Hojo (1931). Equations (4·11), ..., (4·14), together with equations (2·8), ..., (2·11) of our 
paper correspond to equations (19), ..., (22) on pp. 334-5 of Hojo’s paper. If we make the 
following correspondences between our and Hojo’s paper: 


$k - 1 \to p$ = no. below $X_{(k)}$, $\qquad n - k \to m$ = no. above $X_{(k)}$, 


our equation (4·11), together with (2·8), reduces to Hojo’s equation (19) and so on.* 


* [It appears that the functions 7’, I, S and K tabled by Hojo on pp. 325-6 will provide values of 
the Ist and 2nd moments of all order statistics of samples up to n = 13 inclusive and the 3rd and 4th 
moments up to n = 10 inclusive. His functions were, however, only calculated to 8 decimal places 
and when the raw moments are converted to moments about the mean, the effective number of 


5. EXACT VALUES OF SOME MOMENTS 


Let 

$$I_n(a) = \int_{-\infty}^{+\infty} [\Phi(ax)]^n\, e^{-x^2}\, dx, \tag{5·1}$$

then 

$$I_0(a) = \pi^{1/2}. \tag{5·2}$$

Now 

$$\int_{-\infty}^{+\infty} [\Phi(ax) - \tfrac12]^{2m+1}\, e^{-x^2}\, dx = 0, \tag{5·3}$$

since the integrand is an odd function of $x$. Hence 

$$I_{2m+1}(a) = \sum_{i=1}^{2m+1} (-1)^{i+1} \binom{2m+1}{i} \frac{1}{2^i}\, I_{2m+1-i}(a). \tag{5·4}$$

In particular 

$$I_1(a) = \tfrac12 I_0(a) = \tfrac12 \pi^{1/2}, \tag{5·5}$$

and 

$$I_3(a) = \tfrac32 I_2(a) - \tfrac34 I_1(a) + \tfrac18 I_0(a) = \tfrac32 I_2(a) - \tfrac14 I_0(a). \tag{5·6}$$

In general, $I_{2m+1}(a)$ can be expressed as a linear function of $I_{2m}(a), I_{2m-2}(a), \ldots, I_0(a)$. 

Differentiating (5·1) with respect to $a$ (this is justified in virtue of the uniform convergence 
of the integrals with respect to $a$, $-\infty < a < \infty$, and the continuity of the integrands), 
we get, for $n = 2$, 

$$\frac{dI_2}{da} = \frac{1}{\pi^{1/2}}\, \frac{a}{(a^2 + 2)(a^2 + 1)^{1/2}}, \tag{5·7}$$

so that 

$$I_2(a) = \frac{1}{\pi^{1/2}} \arctan\left(\sqrt{a^2 + 1}\right). \tag{5·8}$$

Using (5·6) 

$$I_3(a) = \frac{3}{2\pi^{1/2}} \arctan\left(\sqrt{a^2 + 1}\right) - \tfrac14 \pi^{1/2}. \tag{5·9}$$
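
The closed forms (5·5), (5·8) and (5·9) can be compared with direct numerical integration of (5·1). The sketch below is an added illustration, not the authors' procedure; it uses composite Simpson's rule on $[-8, 8]$, and the function names are hypothetical.

```python
import math

def phi_cdf(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, lo, hi, steps=4000):
    """Composite Simpson's rule (steps must be even)."""
    h = (hi - lo) / steps
    s = f(lo) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

def I_numeric(n, a):
    """I_n(a) of (5.1) by quadrature; exp(-x^2) is negligible beyond |x| = 8."""
    return simpson(lambda x: phi_cdf(a * x) ** n * math.exp(-x * x), -8.0, 8.0)

def I1_closed(a):  # equation (5.5)
    return 0.5 * math.sqrt(math.pi)

def I2_closed(a):  # equation (5.8)
    return math.atan(math.sqrt(a * a + 1.0)) / math.sqrt(math.pi)

def I3_closed(a):  # equation (5.9)
    return 1.5 * I2_closed(a) - 0.25 * math.sqrt(math.pi)

for a in (0.5, 1.0, 2.0):
    print(a, I_numeric(2, a), I2_closed(a), I_numeric(3, a), I3_closed(a))
```

Agreement for several values of $a$ supports the factor $a$ in the numerator of (5·7), since without it the integration leading to (5·8) would not go through.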
20 


Since $P_{t+2}(x)$ is a polynomial in $\Phi(x)$ of degree $n - t - 1$, by using (5·2), (5·5), (5·8) and (5·9), 
we can exactly evaluate (4·15) if $n \le t + 4$. Hence we can exactly evaluate $\mu'_t(n, k)$ for all 
odd values of $t$ if $n \le 5$ and all even values of $t$ if $n \le 6$. Godwin (1949) and Jones (1948) 
have given tables of exact moments $\mu'_t$ for $t = 1$ and 2. The corresponding tables for $t = 3$ 
and 4 are given below. 
Table 1. $\mu'_3(n, k)$ 

  n    k = n                               k = n-1                               k = n-2 
  1    0                                   -                                     - 
  2    2A = 1.41047 39589                  -                                     - 
  3    3A = 2.11571 09383                  0                                     - 
  4    2B + 2C = 2.70042 57044             12A - 6B - 6C = 0.36156 66401         - 
  5    -5A + 5B + 5C = 3.22487 93638       20A - 10B - 10C = 0.60261 10668       0 

Here: $A = \dfrac{5}{4\pi^{1/2}} = 0.70523\ 69794\ 35$, $\quad B = \dfrac{1}{(2\pi)^{3/2}} = 0.06349\ 36359\ 34$, 
$C = \dfrac{15}{2\pi^{3/2}} \arctan(\sqrt{2}) = 1.28671\ 92162\ 55$. 


significant figures is very much reduced. As M. G. Kendall has pointed out (Biometrika, 1954, 41, 
p. 561) Hojo’s values for his S-functions as printed appear to be in error by the omission of a factor 
of 2π which seems to have been included correctly in his moment calculations. Ed.] 


Table 2. $\mu'_4(n, k)$ 

  n    k = n                               k = n-1                               k = n-2 
  2    3.0                                 -                                     - 
  3    3 + a = 4.19454 59401               3 - 2a = 0.61090 81198                - 
  4    3 + 2a = 5.38909 18802              3 - 2a = 0.61090 81198                - 
  5    3 + b + c = 6.52339 54861           3 + 10a - 4b - 4c = 0.85187 74505     3 - 20a + 6b + 6c = 0.24945 41149 
  6    3 - 5a + 3b + 3c = 7.59745 67578    3 + 25a - 9b - 9c = 1.15308 91273     3 - 20a + 6b + 6c = 0.24945 41149 

Here: $a = \dfrac{13}{2\sqrt{3}\,\pi} = 1.19454\ 59400\ 81$, $\quad b = \dfrac{65}{\sqrt{3}\,\pi^2} \arctan\left(\sqrt{5/3}\right) = 3.46675\ 52225\ 38$, 
$c = \dfrac{\sqrt{5}}{4\pi^2} = 0.05664\ 02635\ 46$. 


The authors wish to thank Miss Phyllis A. Groll of Bell Telephone Laboratories for 
numerical computation in Tables 1 and 2. 


CONFIDENCE INTERVALS FOR THE EXPECTATION 
OF A POISSON VARIABLE 


By EDWIN L. CROW AND ROBERT S. GARDNER 


National Bureau of Standards, Boulder, Colorado, and U.S. Naval 
Ordnance Test Station, China Lake, California 


1. INTRODUCTION AND SUMMARY 


A table (Table 1) of two-sided confidence intervals for the expectation of a Poisson variable
is presented for confidence coefficients 0.80, 0.90, 0.95, 0.99, and 0.999 and all values of the
variable from 0 to 300. The essential characteristic of the tabled system can be stated
by reference to the graphical description of confidence limits for the Poisson expectation
by Pearson & Hartley (1954, p. 76). Following them, we denote the Poisson variable by c
and its expectation by m and plot them as rectangular co-ordinates, upper and lower
confidence limits for m being plotted vertically along with c. If the successive upper limits
for c = 0, 1, 2, ... be joined by line segments and the lower limits likewise joined, a
semi-infinite region, called the confidence belt, is formed between the upper and lower broken lines.
The confidence belt tabulated herein is determined by the properties (i) that, of all confidence
belts for the Poisson expectation with a given confidence coefficient which are based directly
on the Poisson variable c, it is as narrow as possible if width is measured in the horizontal
c-direction; and (ii) that, of all such equally narrow confidence belts, it is that one with the
smallest possible upper confidence limits. The second condition serves to minimize the
lengths of the confidence intervals satisfying (i) for small values of c at the expense of
inappreciable relative increase in length for larger values of c.

Thus the tabulated system of confidence intervals, or limits, has an optimum property 
in a geometrical sense, and in fact is the same type of system as that tabulated by Crow 
(1956) for the parameter of the binomial distribution. While the method of calculation is 
indicated in §3 below, a fuller discussion of definitions and calculations is given with the 
binomial table. A useful simplification of the method of calculation was possible for the 
Poisson distribution by virtue of an interesting property of its probability sums not 
possessed by the sums of binomial probability terms (§ 2). 

Because of the optimum property the tabulated intervals would seem to be preferable,
in many applications, to the central intervals tabulated by Garwood (1936), Ricker (1937),
and Pearson & Hartley (1954), for which the probability allowed in each tail of the
distribution is not more than $\tfrac{1}{2}\epsilon$, where the confidence coefficient is taken as
$1-\epsilon$. The percentage decrease in length of interval achieved in going to the new intervals
is shown in § 4. Since the new intervals impose no precise restriction on the division of
probability between the two tails, they are not proposed for obtaining one-sided confidence intervals.
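
As a point of reference for the comparisons that follow, the central limits can be computed directly from the tail condition just described. The sketch below is not the method used by the authors or by the cited tables; it simply solves, by bisection, for the values of m at which each tail probability equals $\tfrac{1}{2}\epsilon$. The function names (`poisson_pmf`, `poisson_cdf`, `bisect`, `central_interval`) are illustrative choices, not taken from the paper.

```python
import math


def poisson_pmf(i, m):
    """Poisson probability exp(-m) m^i / i!, evaluated in log space."""
    if i < 0:
        return 0.0
    if m == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-m + i * math.log(m) - math.lgamma(i + 1))


def poisson_cdf(c, m):
    """P(X <= c) for a Poisson variable with expectation m."""
    return sum(poisson_pmf(i, m) for i in range(c + 1))


def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def central_interval(c, eps):
    """Central limits for m: each tail probability equals eps/2 at its limit."""
    half = eps / 2.0
    # Upper limit: P(X <= c; m) = eps/2, a decreasing function of m.
    hi = float(c + 1)
    while poisson_cdf(c, hi) > half:
        hi *= 2.0
    upper = bisect(lambda m: poisson_cdf(c, m) - half, 0.0, hi)
    # Lower limit: P(X >= c; m) = eps/2; taken as 0 when c = 0.
    if c == 0:
        lower = 0.0
    else:
        lower = bisect(lambda m: (1.0 - poisson_cdf(c - 1, m)) - half, 0.0, float(c))
    return lower, upper


print(central_interval(0, 0.10))
print(central_interval(50, 0.05))
```

For c = 0 and a 90 % coefficient the upper limit is -ln 0.05, about 3.00, and for c = 50 at 95 % the limits come out close to the central values 37.11 and 65.92 quoted in § 4.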

The term 'confidence coefficient' and notation $1-\epsilon$ are used herein to refer to the lower
bound on the probability that a confidence interval for the Poisson expectation m does
include m; the probability almost never falls to $1-\epsilon$ because the distribution of c is discrete.
A number of papers have shown how the confidence coefficient for the parameter of a
discrete distribution may be precisely attained by adding a random variable uniformly
distributed over (0, 1) to the observed discrete variable. Furthermore, this device can produce
the shortest unbiased system of confidence intervals (Eudey, 1949; Blank, 1956). Stevens
(1957) has given a practical procedure for calculating such a randomized interval for either
the binomial or Poisson parameters. Although his procedure is short (of the order of 15 min.
in a trial by the present writers), the almost universal occurrence of the Poisson distribution
seems to warrant the publication of a table giving an optimum interval directly.

28 Biom. 46
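
The remark that the attained probability rarely equals $1-\epsilon$ exactly can be checked numerically. The sketch below takes the 90 % limits for c = 0 to 8 as printed in Table 1 and sums the Poisson probabilities of those c whose interval contains a given m. The names `LIMITS_90`, `poisson_pmf` and `coverage` are illustrative, and the calculation is only valid for 0 < m < 4.532, where no interval for c > 8 can contain m.

```python
import math

# 90 % lower and upper limits for c = 0..8, as printed in Table 1.
LIMITS_90 = [
    (0.0, 2.436), (0.105, 4.532), (0.532, 5.976), (1.102, 7.512),
    (1.745, 8.597), (2.433, 9.716), (2.436, 11.342), (3.589, 12.531),
    (4.532, 13.553),
]


def poisson_pmf(i, m):
    """Poisson probability exp(-m) m^i / i! for m > 0."""
    return math.exp(-m + i * math.log(m) - math.lgamma(i + 1))


def coverage(m, limits):
    """Probability that the interval attached to the observed c contains m."""
    return sum(poisson_pmf(c, m)
               for c, (lo, hi) in enumerate(limits) if lo <= m <= hi)


for m in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(f"m = {m:.1f}  coverage = {coverage(m, LIMITS_90):.4f}")
```

At each of these values of m the attained probability exceeds 0.90 rather than equalling it, which is the behaviour the paragraph describes.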


2. A PROPERTY OF POISSON PROBABILITY SUMS 


The Poisson distribution is the distribution of a random variable c which has the expectation
m and the probability

\[ p_i(m) = \frac{e^{-m} m^i}{i!} \qquad (i = 0, 1, 2, \dots;\ m > 0) \tag{1} \]

of taking on the value i. We note the well-known fact that for fixed m the sequence (1) has
a unique largest number except that when m is an integer the maximum is attained by two
consecutive members. On each side of the largest member(s) $p_i(m)$ is a strictly monotone
function of i.

On the other hand, a sum

\[ \sum_{i=c_1}^{c_2} p_i(m) \qquad (c_1 \le c_2) \]

over a fixed set $(c_1, c_1+1, \dots, c_2)$ is an analytic function of m and has a unique maximum
for m satisfying

\[ \frac{d}{dm} \sum_{i=c_1}^{c_2} p_i(m) = p_{c_1-1}(m) - p_{c_2}(m) = 0 \qquad (c_1 > 0), \tag{2} \]

that is, for

\[ m = m_{c_1,c_2} = [c_1(c_1+1) \cdots c_2]^{1/(c_2-c_1+1)}. \tag{3} \]

(Equation (3) applies for $c_1 = 0$, as does (2) if $p_{-1}(m)$ is defined to be identically zero.)
Since $p_{c_1-1}(m) \gtreqless p_{c_2}(m)$ according as $m \lesseqgtr m_{c_1,c_2}$ ($c_1 > 0$), it follows that

\[ \sum_{i=c_1-1}^{c_2-1} p_i(m) \gtreqless \sum_{i=c_1}^{c_2} p_i(m) \quad \text{according as} \quad m \lesseqgtr m_{c_1,c_2}. \tag{4} \]

We note that $m_{c_1,c_2}$ increases as either one of its subscripts increases. From (4) we see that,
for m between $m_{c_1-1,c_2-1}$ and $m_{c_1,c_2}$,

\[ \sum_{i=c_1-1}^{c_2-1} p_i(m) > \sum_{i=c_1}^{c_2} p_i(m) > \sum_{i=c_1+1}^{c_2+1} p_i(m) > \cdots . \]

Also for this interval

\[ \sum_{i=c_1-1}^{c_2-1} p_i(m) > \sum_{i=c_1-2}^{c_2-2} p_i(m) > \sum_{i=c_1-3}^{c_2-3} p_i(m) > \cdots . \]

Hence for fixed r

\[ \max_{c=0,1,\dots} \sum_{i=c}^{c+r} p_i(m) = \sum_{i=c_1}^{c_1+r} p_i(m) \qquad (m_{c_1,c_1+r} \le m \le m_{c_1+1,c_1+r+1};\ c_1 = 0, 1, \dots;\ r = 0, 1, \dots), \tag{5} \]

and these maximum sums form a non-increasing function of m, as illustrated in Fig. 1.
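
The property can be checked numerically. The sketch below assumes that equation (3) gives the maximizing m as the geometric mean of the integers $c_1, \dots, c_2$, and that equation (5) identifies which block of r + 1 consecutive terms has the largest sum. The helper names (`block_sum`, `m_max`, `best_start`) and the search limit `c_limit` are illustrative, not from the paper.

```python
import math


def poisson_pmf(i, m):
    """Poisson probability exp(-m) m^i / i!, with p_i = 0 for i < 0."""
    if i < 0:
        return 0.0
    if m == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-m + i * math.log(m) - math.lgamma(i + 1))


def block_sum(c1, c2, m):
    """Sum of p_i(m) over the consecutive integers c1, ..., c2."""
    return sum(poisson_pmf(i, m) for i in range(c1, c2 + 1))


def m_max(c1, c2):
    """Equation (3): geometric mean of c1, c1+1, ..., c2 (zero when c1 = 0)."""
    product = math.prod(range(c1, c2 + 1))
    return product ** (1.0 / (c2 - c1 + 1))


def best_start(m, r, c_limit=200):
    """Starting value c whose block c, ..., c+r has the largest sum (brute force)."""
    return max(range(c_limit), key=lambda c: block_sum(c, c + r, m))


# The block (2, 3, 4) has its largest sum at m = 24 ** (1/3).
peak = m_max(2, 4)
print(peak, block_sum(2, 4, peak))

# With r = 3, m = 3 lies between m_{1,4} and m_{2,5}, so the best block starts at 1.
print(m_max(1, 4), m_max(2, 5), best_start(3.0, 3))
```

The brute-force search in `best_start` is only there to confirm equation (5); the equation itself makes the search unnecessary.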


3. CALCULATION OF THE TABLE 


The calculation of Table 1 printed on pp. 448-53 below is of the same general nature as that 
described in detail by Crow (1956) for the binomial parameter. The main body of calculation 
was performed on a high-speed electronic computer which calculated in the process all 
required terms of the Poisson distribution from (1), without reference to existing tables. 
The confidence limits for c = 0 (1) 20 were checked by independent calculation using a desk 
calculator and the tables of Poisson individual terms and sums of Molina (1942) and Pearson
& Hartley (1954), and Kitagawa’s (1952) table of individual terms. Both calculations were 
made to three decimal places of the parameter m, but accuracy of the last place was not 
always sought in the desk calculation. 














[Figure: the sums $\sum_{i=c}^{c+r} p_i(m)$ plotted against m for c = 0, 1, 2, 3, 4, with $m_{0,r}$, $m_{1,r+1}$, $m_{2,r+2}$, $m_{3,r+3}$, $m_{4,r+4}$ marked on the horizontal axis.]

Fig. 1. Poisson probability sums illustrated with r = 3. The characteristics are general except
that for r = 0, c = 0 the slope is negative at m = 0.


The calculation of a system, say $\delta = \delta(c; m, \epsilon)$, of confidence intervals for m with
confidence coefficient $1-\epsilon$ from observations on the Poisson variable c reduces to the
determination, for each m, of acceptance regions A(m) consisting of consecutive integers
$c_1 \ge 0, c_1+1, \dots, c_2$ such that

\[ \sum_{i=c_1}^{c_2} p_i(m) \ge 1-\epsilon \tag{6} \]

and satisfying alternative further conditions sufficient to define the system uniquely. The
central confidence intervals previously tabulated will be denoted by $\delta_1$ and the system in
Table 1, defined by the two properties stated in § 1, will be denoted by $\delta_4$ (following Crow's
notation for the analogous binomial intervals). The system $\delta_3$ is defined by Sterne's (1954)
condition that each $p_i(m)$ included in the sum (6) be not smaller than each $p_i(m)$ excluded;
an immediate consequence of this condition is that the corresponding acceptance regions
$A_3(m)$ are as short as possible, in other words, that $\delta_3$ has property (i) stated in § 1. The system
$\delta_4$ arose as a modification of $\delta_3$ which is made possible by the discreteness of c and the
continuity of the sum (6) as a function of m for fixed $c_1$ and $c_2$.
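
Sterne's condition can be illustrated for a single value of m. The sketch below builds the acceptance region by starting at a most probable value and repeatedly adding whichever neighbouring integer has the larger probability, until the sum in (6) reaches $1-\epsilon$; because $p_i(m)$ is monotone on each side of its maximum, this keeps every included probability at least as large as every excluded one. It covers only the Sterne-type region, not the further shifting rule that distinguishes the tabulated system, and the function name `sterne_region` is an illustrative choice.

```python
import math


def poisson_pmf(i, m):
    """Poisson probability exp(-m) m^i / i!, with p_i = 0 for i < 0."""
    if i < 0:
        return 0.0
    if m == 0.0:
        return 1.0 if i == 0 else 0.0
    return math.exp(-m + i * math.log(m) - math.lgamma(i + 1))


def sterne_region(m, eps):
    """Return (c1, c2) for the Sterne-type acceptance region at expectation m."""
    lo = hi = int(math.floor(m))          # a most probable value of c
    total = poisson_pmf(lo, m)
    while total < 1.0 - eps:
        left = poisson_pmf(lo - 1, m)     # zero when lo is already 0
        right = poisson_pmf(hi + 1, m)
        if left > right:
            lo -= 1
            total += left
        else:                             # ties are resolved to the right
            hi += 1
            total += right
    return lo, hi


for m in (0.5, 1.0, 3.0):
    print(m, sterne_region(m, 0.10))
```

At m = 3 and a 90 % coefficient the region is c = 1, ..., 6, so that c = 0 is no longer accepted; the confidence interval for an observed c is then the set of m whose region contains c.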

The acceptance regions $A_3(m)$ and $A_4(m)$ are determined by proceeding continuously
from small to larger values of m, starting at m = 0, and making a change in $c_1$ or $c_2$ or both
in (6) as required. If for a given m, say $m'$, the acceptance region $A_i(m)$, (i = 3, 4), is
$(c_1', \dots, c_2' = c_1' + r')$, then by definition of the $A_i(m)$ no probability sum of less than $r'+1$
terms is as large as $1-\epsilon$ for either $m = m'$ or, consequently, by equation (5) and the
concluding statement of § 2, for $m > m'$. Hence the size of the acceptance region $A_i(m)$ (i = 3, 4)
cannot decrease as m increases. (This is not true for the corresponding binomial distribution
acceptance regions.) We therefore maintain the same size if possible by substituting $c_2'+1$
for $c_1'$, at $m = m_{c_1'+1,\,c_2'+1}$ for $A_3(m)$, or where $\sum_{i=c_1'+1}^{c_2'+1} p_i(m)$ first attains
$1-\epsilon$ for $A_4(m)$. If the probability sum drops to $1-\epsilon$ before substitution is possible, we
enlarge $A_i(m)$ by adding $c_2'+1$ to $(c_1', \dots, c_2')$. Alternatively, it would be possible to shift
or enlarge the acceptance region to the left, adding $c_1'-1$, but to no useful purpose; in fact it
would yield confidence sets that are pairs of intervals rather than single intervals.*


4. COMPARISON OF SYSTEMS OF CONFIDENCE INTERVALS
For the purpose of comparison the $\delta_3$ system was also computed for $c \le 50$ and the $\delta_1$ system
obtained from Pearson & Hartley's tables 40 and 8. Some lengths $d_1$ of $\delta_1$ and corresponding
algebraic and percentage decreases in length from $d_1$ are shown in Table 2 for confidence
coefficients of 90 and 99 %. The results are rounded.


Table 2. Comparison of $d_1$, $d_3$ and $d_4$

            $d_1$         $d_1 - d_4$   $100(d_1-d_4)/d_1$  $100(d_1-d_3)/d_1$
   c      90      99      90      99      90      99      90      99

   0    3.00    5.30    0.56    0.53   18.7%    9.9%    0.1%    0.1%
   1    4.69    7.43    0.27    0.52     5.7     7.0    -3.3     1.3
   2    5.94    9.17    0.50    0.59     8.4     6.5     0.0     0.0
   3    6.94   10.64    0.53    0.60     7.6     5.7     0.9     2.0
   4    7.79   11.92    0.94    0.37    12.0     3.1     6.4    -0.5
   5    8.54   13.07    1.26    0.56    14.7     4.3     5.4     0.3
   6    9.23   14.12    0.32    0.63     3.5     4.5     2.4     0.7
   7    9.86   15.10    0.92    0.63     9.3     4.1     4.2     0.7
   8   10.45   16.01    1.43    0.55    13.7     3.4    4.65     0.7
   9   11.01   16.87    0.24    0.91     2.2     5.4     3.8     2.9
  10   11.54   17.68    1.52    1.13    13.2     6.4     4.3     2.5
  12   12.52   19.20    1.63    0.21    13.0     1.1     4.1     0.9
  14   13.42   20.61    0.99    1.28     7.4     6.2     3.3     2.0
  16   14.26   21.91    0.15    0.81     1.1     3.7     1.9     1.3
  18   15.06   23.15    0.24    0.04     1.6     0.2     3.4     0.6
  20   15.81   24.32    1.27    0.61     8.0     2.5     2.1     1.4
  25   17.54   27.00    1.71    0.42     9.7     1.6     3.2     1.6
  30   19.10   29.44    0.03    1.30     0.2     4.4     2.9     1.5
  40   21.87   33.77    0.90    1.25     4.1     3.7     1.6     0.7
  50   24.33   37.61    0.08    1.74     0.3     4.6     1.4     1.1














The differences $d_1 - d_4$ (observed for confidence coefficient 95 % as well as those listed
above and for somewhat more values of c than listed) oscillate rapidly within the range 0
to 2 as c varies, with no substantial trend as either c or confidence coefficient varies.
Consequently the relative improvement of $\delta_4$ over $\delta_1$ is appreciable for at least small values of c
but tends to decrease as c increases. The relative improvement of $\delta_3$ over $\delta_1$ is also appreciable
for small values of c but occasionally becomes negative. However, since $\delta_3$ and $\delta_4$ have
acceptance regions of the same size, the average of $(d_1 - d_4)/d_1$ over c = 0, 1, ... to any finite
value exceeds the corresponding average of $(d_1 - d_3)/d_1$.


* The confidence sets for the parameter of a binomial distribution considered by Clunies-Ross (1958)
occasionally consist of pairs of intervals, including cases for confidence coefficients 0.90 and 0.99.
Hence his statement on p. 278: 'In practice it seems that the restriction (4.6 a, b) is not needed for the
usual values of $\alpha$ = 0.9500, 0.9000', needs correction; see Crow (1956), p. 425.
It is of interest to compare the tabulated $\delta_4$ limits with various large-sample
approximations in order to test the validity of using the $\delta_4$ limits as far as they are tabulated
and an approximation beyond that. Such a comparison is summarized below for
confidence coefficient 95 % and c = 50:


Table 3. Comparison with large-sample approximations

                                   Eq. no. in     Upper    Lower    Length of    % Diff. of length
Identification of interval         Hald (1952)    limit    limit    interval     from $d_1$

$\delta_1$                              -         65.92    37.11      28.81          0.0
$\delta_3$                              -         65.85    37.67      28.18         -2.2
$\delta_4$                              -         64.95    37.67      27.28         -5.3
Normal approx. without
  continuity correction                 -         65.91    37.93      27.98         -2.9
Normal approx. with
  continuity correction              (22.5.4)     66.48    37.50      28.98         +0.6
Square-root normal approx.
  with continuity correction         (22.5.5)     65.39    36.67      28.72         -0.3
Mean of (22.5.4) and (22.5.5)           -         65.93    37.08      28.85         +0.1





Quite generally the continuity correction lengthens the interval by very close to 1. The
mean of the two approximations with continuity corrections is closer to $\delta_1$ than either one
singly for both upper and lower limits in the above instance, and this seems to be the case
for c > 50 and even substantially smaller c (cf. Hald, p. 718). Consequently the suggested
large-sample approximation, effectively identical with the mean of (22.5.4) and (22.5.5),
but simpler by virtue of averaging the two approximations to percentage points for a given
m (Hald's (22.2.7)) before solving for the limits, is

\[ \left. \begin{matrix} m_U \\ m_L \end{matrix} \right\} = c \pm \tfrac{1}{2} + \tfrac{3}{8} u_\epsilon^2 \pm u_\epsilon \sqrt{c \pm \tfrac{1}{2} + \tfrac{1}{8} u_\epsilon^2}, \tag{7} \]

where $u_\epsilon$ is the upper $100\epsilon/2$ percentage point of the normal distribution of mean 0 and
variance 1, and the upper signs are for the upper limit $m_U$, the lower for the lower limit $m_L$.
Equation (7) is a remarkably accurate approximation to $\delta_1$; the absolute error in $m_U$ or
$m_L$ apparently is less than 0.1 for c > 0 when $\epsilon$ = 0.10 and for c > 2 when $\epsilon$ = 0.05. An
alternative approach is that of Walsh (1954), who considered, rather than the correction
terms of (7), functions of $\epsilon$ which he then determined so as to render probability errors small
over a wide range. Comparisons similar to the above for confidence coefficients of 90 and
99 % show about the same order of differences. Since the $\delta_4$ intervals are shorter than $\delta_1$
intervals in all observed cases and the percentage difference in length tends to zero as c
becomes infinite, it is concluded that using $\delta_4$ intervals as far as they are tabulated and the
large-sample approximation (7) for larger c is a valid and satisfactory procedure.
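
The large-sample approximation is easy to evaluate. The sketch below assumes that equation (7) reads $m = c \pm \tfrac{1}{2} + \tfrac{3}{8}u_\epsilon^2 \pm u_\epsilon\sqrt{c \pm \tfrac{1}{2} + \tfrac{1}{8}u_\epsilon^2}$, a reading that reproduces the 'Mean of (22.5.4) and (22.5.5)' row of Table 3. The function name `approx_limits` is illustrative, and returning 0 as the lower limit when c = 0 is an added assumption, since the square root need not be real in that case.

```python
import math
from statistics import NormalDist


def approx_limits(c, eps):
    """Approximate (lower, upper) limits for m from equation (7)."""
    # Upper 100*eps/2 percentage point of the standard normal distribution.
    u = NormalDist().inv_cdf(1.0 - eps / 2.0)

    def limit(sign):
        base = c + sign * 0.5
        return base + 0.375 * u * u + sign * u * math.sqrt(base + 0.125 * u * u)

    lower = 0.0 if c == 0 else limit(-1.0)   # assumption for c = 0
    upper = limit(+1.0)
    return lower, upper


lower, upper = approx_limits(50, 0.05)
print(f"c = 50, 95 %: lower = {lower:.2f}, upper = {upper:.2f}")
```

For c = 50 and a 95 % coefficient this gives limits close to 37.08 and 65.93, in agreement with the last row of Table 3.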

For comparison with $\delta_4$ the randomized 90 % confidence interval was calculated according
to the procedure of Stevens (1957) for the first few values of c and intervals of 0.1 in the added
uniformly distributed variable.* The mean lengths, say $d_5$, of these intervals for each c
are compared in Table 4 with the corresponding values of $d_4$:
Table 4. Comparison with mean randomized intervals

   c    $d_5$    $d_4$    $100(d_5 - d_4)/d_4$

   0     2.04     2.44       -16.4%
   1     3.90     4.43       -12.0
   2     5.11     5.44        -6.1
   3     6.08     6.41        -5.1
   4     6.91     6.85        +0.9
   5     7.66     7.28        +5.2
   6     8.33     8.91        -6.5
   7     8.96     8.94        +0.2
   8     9.54     9.02        +5.8
   9    10.10    10.77        -6.2
  10    10.62    10.01        +6.1
The system $\delta_4$ varies irregularly in length and location while $d_5$ varies smoothly.
However, in any particular case $\delta_5$ is determined in part by a random number x between 0 and 1
and hence will take on some irregularity; as an extreme example, the 90 % interval for
c = 0 and x < 0.05 has length 0.
Crow (1956) discussed briefly, in relation to the binomial parameter, the use of the
expected length of a confidence interval as a criterion for choice of interval. In a similar
[Figure: expected length of interval (vertical axis) plotted against m from 0 to 14 (horizontal axis).]

Fig. 2. Expected lengths of various confidence intervals for 90 % confidence.
$\delta_1$: central; $\delta_3$: Sterne-type; $\delta_4$: tabulated herein; $\delta_5$: randomized central.


* A slight change from his procedure was in using for c = 0, 1, 2, 3, the exact formula (Stevens, 
1950) rather than the quadratic Taylor’s series approximation. The approximation is not always 
accurate to within 0-1 for such small c, so that it is desirable to calculate both the forward extrapolate 
(from x = 0, in Stevens’s notation) and the backward extrapolate (from x = 1). The true limit is usually 
between the two extrapolates, but one exception was found. The forward extrapolate is more accurate 
than the backward extrapolate. 
manner the expected lengths of $\delta_1$, $\delta_3$, $\delta_4$, and the shortest unbiased system, say $\delta_5$, were
calculated for small values of m, the last one by Stevens's procedure, using Molina's table;
these expected lengths are graphed in Fig. 2. The randomized system $\delta_5$ has the smallest
expected length in the range of calculation, but the difference from that of $\delta_4$ is not more than
0.06 for 6 < m < 14. The expected length of $\delta_3$ lies between those of $\delta_1$ and $\delta_4$ except that
it exceeds that of $\delta_1$ slightly for 0.01 < m < 1.5. By using $\delta_4$ instead of $\delta_1$ the experimenter
can 'expect' a 90 % confidence interval shorter by between 0.45 (for m = 1) and about
0.9 (for 7 < m < 14).
Table 1. Confidence limits for the expectation of a Poisson variable

                         Confidence coefficient, $100(1-\epsilon)$ %
           80                90                95                99               99.9
  c    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper

  0        0    1.819        0    2.436        0    3.285        0    4.771        0    7.065
  1    0.223    3.546    0.105    4.532    0.051    5.323    0.010    6.914    0.001    9.561
  2    0.824    4.758    0.532    5.976    0.355    6.686    0.149    8.727    0.045   11.532
  3    1.535    5.882    1.102    7.512    0.818    8.102    0.436   10.473    0.190   13.596
  4    1.819    7.564    1.745    8.597    1.366    9.598    0.823   12.347    0.429   15.144
  5    2.645    8.529    2.433    9.716    1.970   11.177    1.279   13.793    0.739   17.114
  6    3.546    9.922    2.436   11.342    2.613   12.817    1.785   15.277    1.107   18.490
  7    3.914   10.969    3.589   12.531    3.285   13.765    2.330   16.801    1.520   19.987
  8    4.758   12.481    4.532   13.553    3.285   14.921    2.906   18.362    1.971   21.603
  9    5.696   13.243    4.532   15.298    4.460   16.768    3.507   19.462    2.452   23.170
 10    5.882   15.205    5.976   15.985    5.323   17.633    4.130   20.676    2.961   24.677
 11    7.564   15.438    5.976   17.810    5.323   19.050    4.771   22.042    3.491   26.126
 12    7.564   16.914    7.512   18.403    6.686   20.335    4.771   23.765    4.042   27.530
 13    8.529   18.537    7.512   20.054    6.686   21.364    5.829   24.925    4.611   28.907
 14    9.922   18.938    8.597   21.035    8.102   22.945    6.668   25.992    5.195   30.372
 15    9.922   20.414    9.484   22.258    8.102   23.762    6.914   27.718    5.794   31.993
 16   10.969   22.037    9.716   23.824    9.598   25.400    7.756   28.852    6.405   33.622
 17   12.481   22.326   11.342   24.452    9.598   26.306    8.727   29.900    7.028   34.745
 18   12.481   23.744   11.342   26.158   11.177   27.735    8.727   31.839    7.065   35.916
 19   13.243   25.707   12.531   26.935   11.177   28.966   10.009   32.547    8.023   37.384
 20   15.205   25.707   13.553   28.092   12.817   30.017   10.473   34.183    8.840   39.108
 21   15.205   26.972   13.553   29.988   12.817   31.675   11.242   35.204    9.561   40.124
 22   15.438   28.469   15.298   30.179   13.765   32.277   12.347   36.544    9.561   41.245
 23   16.914   29.983   15.795   31.639   14.921   34.048   12.347   37.819   10.710   43.041
 24   18.537   30.152   15.985   33.444   14.921   34.665   13.793   38.939   11.532   44.162
 25   18.537   31.507   17.810   33.643   16.768   36.030   13.793   40.373   11.532   45.213
 26    18.94    33.03    18.28    35.08    16.77    37.67    15.28    41.39    12.73    47.08
 27    20.41    34.42    18.40    37.00    17.63   38.165    15.28    42.85    13.60    48.01
 28    22.04   34.585    20.05    37.04    19.05    39.76    16.80    43.91    13.60    49.32
 29    22.04    35.92   21.035    38.44    19.05    40.94    16.80    45.26    14.89    50.97
 30    22.33    37.39   21.035   40.105   20.335    41.75    18.36    46.50    15.14    51.76
 31    23.74    39.07    22.26    40.99    21.36    43.45    18.36    47.62    16.11    53.54
 32    25.71    39.07    23.82    41.74    21.36    44.26    19.46    49.13    17.11   54.455
 33    25.71   40.235    23.82    43.22   22.945    45.28   20.285    49.96    17.11    55.88
 34    25.71    41.62    24.45    44.87    23.76   47.025    20.68    51.78    18.49    57.15
 35    26.97    43.25    26.16    45.00    23.76    47.69    22.04    52.28    18.49    58.20
 36    28.47    44.20   26.935    46.38    25.40    48.74    22.04    54.03    19.87   59.875
 37    29.98    44.48   26.935    47.97    26.31    50.42   23.765    54.74    19.99    60.63
 38    29.98    45.79    28.09    49.12    26.31    51.29   23.765    56.14    21.27    62.67
 39    30.15    47.20    29.99    49.56   27.735    52.15   24.925   57.615    21.60    63.18
 40    31.51    48.99    29.99    50.96    28.97    53.72    25.83    58.35    22.68    65.13
 41    33.03    49.46    30.18    52.64    28.97    54.99    25.99    60.39    23.17    65.70
 42    34.42   49.945    31.64    53.46    30.02    55.51    27.72    60.59    24.13    67.62
 43    34.42    51.25    33.44    54.05   31.675    56.99    27.72    62.13    24.68    68.19
 44   34.585    52.64    33.44   55.445   31.675    58.72    28.85   63.635    25.63   70.125
 45    35.92    54.29    33.64    57.10    32.28    58.84    29.90    64.26    26.13    70.66
 46    37.39    55.16    35.08    57.99    34.05    60.24    29.90    65.96    27.18    72.66
 47    39.07    55.33    37.00    58.48   34.665    61.90    31.84   66.815    27.53    73.10
 48    39.07    56.61    37.00    59.85   34.665    62.81    31.84    67.92    28.76    74.96
 49    39.07    57.95    37.04    61.41    36.03    63.49    32.55    69.83    28.91    75.52
 50   40.235    59.44    38.44    62.69    37.67    64.95    34.18    70.05    30.37    77.22

If c is the observed frequency or count and $m_L$, $m_U$ are the lower and upper $100(1-\epsilon)$ %
confidence limits for its expectation, m, then $\Pr(m_L \le m \le m_U) \ge 1-\epsilon$.
Table 1 (cont.)

                         Confidence coefficient, $100(1-\epsilon)$ %
           80                90                95                99               99.9
  c    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper

 51    41.62    61.02   40.105    62.86    37.67    66.76    34.18    71.56    30.37    78.14
 52    43.25    61.02    40.99    64.19   38.165    66.76    35.20    73.20    31.99    79.53
 53    44.20    61.90    40.99   65.645    39.76    68.10    36.54    73.62    31.99    80.83
 54    44.20    63.19    41.74    67.45    40.94    69.62    36.54    75.16    33.62   81.845
 55    44.48    64.56    43.22    67.45    40.94    71.09    37.82    76.61    33.62    83.50
 56    45.79    66.12    44.87    68.49    41.75    71.28    38.94    77.15   34.745    84.17
 57    47.20    67.33    44.87    69.86    43.45    72.66    38.94    78.71    35.59    86.16
 58    48.99    67.33    45.00    71.42    44.26    74.22    40.37    80.06    35.92    86.49
 59    49.46    68.38    46.38    72.67    44.26    75.49    41.39    80.65    37.38    88.14
 60    49.46   69.675    47.97    72.76    45.28   75.785    41.39    82.21    37.38    89.23
 61   49.945    71.04    49.12    74.07   47.025    77.16    42.85    83.56    39.11    90.34
 62    51.25   72.585    49.12    75.47    47.69    78.73    43.91    84.12    39.11    92.01
 63    52.64    73.87    49.56   77.155    47.69    79.98    43.91    85.65    40.12    92.59
 64    54.29    73.87    50.96    77.94    48.74    80.25    45.26    87.12    41.21    94.36
 65    55.16    74.79    52.64    78.27    50.42    81.61    46.50    87.55   41.245   95.015
 66    55.16    76.07    53.46    79.59    51.29    83.14    46.50    89.05    43.04    96.40
 67    55.33    77.40    53.46    80.99    51.29    84.57    47.62    90.72    43.04    97.95
 68    56.61    78.85    54.05    82.72    52.15    84.67    49.13    90.96    44.16    98.59
 69    57.95    80.60   55.445    83.37    53.72    86.01    49.13    92.42    45.18   100.36
 70    59.44    80.60    57.10    83.73    54.99    87.48    49.96   94.345    45.21   101.02
 71    61.02    81.13    57.99    85.04    54.99    89.23    51.78    94.35    47.08   102.34
 72    61.02    82.38    57.99    86.43    55.51    89.23    51.78    95.76    47.08   104.01
 73    61.02    83.67    58.48    88.05    56.99    90.37    52.28    97.42    48.01   104.50
 74    61.90    85.04    59.85    89.04    58.72    91.78    54.03    98.36    49.32   106.13
 75    63.19    86.58    61.41    89.15    58.72    93.48    54.74    99.09    49.32  107.235
 76    64.56    87.87    62.69    90.44    58.84    94.23    54.74   100.61    50.97   108.17
 77    66.12    87.87    62.69    91.79    60.24   94.705    56.14  102.165    51.39   110.17
 78    67.33    88.64    62.86    93.29    61.90    96.06   57.615   102.42    51.76   110.34
 79    67.33    89.89    64.19    94.80    62.81   97.545   57.615   103.84    53.54   111.85
 80    67.33    91.19   65.645    94.80    62.81    99.17    58.35   105.66    53.54   113.51
 81    68.38    92.55    67.45    95.78    63.49    99.17    60.39   106.12   54.455   113.95
 82   69.675   94.115    67.45    97.10    64.95   100.32    60.39   107.10    55.88   115.52
 83    71.04    95.34    67.45    98.51    66.76   101.71    60.59  108.615    55.88   116.84
 84   72.585    95.34    68.49   100.28    66.76  103.315    62.13   110.16    57.15   117.54
 85    73.87    96.08    69.86   100.81    66.76   104.40   63.635   110.37    58.20   119.21
 86    73.87    97.32    71.42  101.095    68.10   104.58   63.635   111.78    58.20   120.17
 87    73.87    98.60    72.67   102.38    69.62  105.905    64.26   113.45   59.875   121.12
 88    74.79    99.95    72.67   103.71    71.09   107.32    65.96   114.33    60.49   122.92
 89    76.07   101.43    72.76   105.17    71.09   109.11   66.815   114.99    60.63   123.46
 90    77.40   103.02    74.07   106.92    71.28   109.61   66.815   116.44    62.67   124.68
 91    78.85   103.02    75.47   106.92    72.66   110.11    67.92   118.33    62.67   126.67
 92    80.60   103.45   77.155   107.63    74.22   111.44    69.83   118.33    63.18   126.75
 93    80.60   104.68    77.94  108.915    75.49   112.87    69.83   119.59    65.13   128.21
 94    80.60   105.94    77.94   110.26    75.49   114.84    70.05   121.09    65.13   130.14
 95    81.13   107.24    78.27   111.75   75.785   114.84    71.56   122.69    65.70   130.26
 96    82.38   108.63    79.59   113.35    77.16  115.605    73.20   122.78    67.62   131.72
 97    83.67  110.275    80.99   113.35    78.73   116.93    73.20   124.16    67.62   133.64
 98    85.04   111.18    82.72   114.11    79.98   118.35    73.62   125.70    68.19   133.75
 99    86.58   111.18    83.37   115.40    79.98   120.36    75.16   127.07   70.125   135.20
100    87.87   111.98    83.37   116.74    80.25   120.36    76.61   127.31   70.125   137.15












Table 1 (cont.)

                         Confidence coefficient, $100(1-\epsilon)$ %
           80                90                95                99               99.9
  c    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper    Lower    Upper

101    87.87   113.21    83.73   118.21    81.61   121.06    76.61   128.70    70.66   137.22
102    87.87  114.475    85.04   119.88    83.14   122.37    77.15  130.275    72.66   138.66
103    88.64   115.79    86.43   119.88    84.57   123.77    78.71   131.50    72.66   140.64
104    89.89   117.19    88.05   120.55    84.57   125.46    80.06   131.82    73.10   140.74
105    91.19   118.95    89.04   121.82    84.67   126.26    80.06   133.21    74.96   142.10
106    92.55   119.50    89.04   123.15    86.01   126.48    80.65   134.79    75.41   143.83
107   94.115   119.50    89.15   124.57    87.48   127.78    82.21   135.99    75.52   144.55
108    95.34   120.44    90.44   126.52    89.23   129.14    83.56   136.30    77.22  145.515
109    95.34   121.67    91.79   126.52    89.23   130.68    83.56   137.68    78.14  147.135
110    95.34  122.925    93.29  126.945    89.23   132.03    84.12   139.24    78.14   148.24
111    96.08   124.23    94.80   128.20    90.37   132.03    85.65   140.54    79.53   148.92
112    97.32   125.61    94.80   129.50    91.78  133.145    87.12   140.76    80.83   150.46
113    98.60   127.22    94.80   130.87    93.48   134.48    87.12   142.12    80.83   151.91
114    99.95   128.23    95.78   132.44    94.23   135.92    87.55   143.64   81.845   152.31
115   101.43   128.23    97.10   133.66    94.23   137.79    89.05   145.13    83.50   153.79
116   103.02   128.84    98.51   133.66   94.705   137.79    90.72   145.19    83.50  155.575
117   103.02   130.06   100.28   134.54    96.06   138.49    90.72   146.54    84.17   155.69
118   103.02   131.30   100.81   135.81   97.545   139.79    90.96   148.01    86.16   157.12
119   103.45  132.575   100.81   137.13    99.17   141.16    92.42   149.76    86.16   158.89
120   104.68   133.91  101.095   138.55    99.17   142.70   94.345   149.76    86.49   159.49
121   105.94   135.37   102.38   140.54    99.17   144.01   94.345   150.93    88.14   160.45
122   107.24   137.08   103.71   140.54   100.32   144.01    94.35  152.355    89.23   162.01
123   108.63   137.08   105.17   140.85   101.71   145.08    95.76   154.18    89.23   163.36
124  110.275   137.18   106.92   142.09  103.315   146.39    97.42   154.60    90.34   163.79
125   111.18   138.38   106.92   143.37   104.40   147.80    98.36   155.31    92.01   165.25
126   111.18  139.605   106.92   144.71   104.40   149.53    98.36   156.69    92.01   167.11
127   111.18   140.85   107.63   146.16   104.58   150.19    99.09   158.25    92.59   167.12
128   111.98   142.14  108.915   147.94  105.905   150.36   100.61   159.53    94.36   168.52
129   113.21   143.49   110.26   147.94   107.32   151.63  102.165   159.67   95.015   170.14
130  114.475   145.01   111.75   148.35   109.11   152.96  102.165   161.01   95.015   171.24
131   115.79   146.45   113.35   149.60   109.61   154.39   102.42   162.46    96.40   171.81
132   117.19   146.45   113.35   150.88   109.61   156.32   103.84   164.31    97.95   173.28
133   118.95   146.66   113.35   152.21   110.11   156.32   105.66   164.31    97.95   175.08
134   119.50   147.86   114.11   153.67   111.44   156.87   106.12   165.33    98.59   175.11
135   119.50   149.08   115.40   155.43   112.87   158.15   106.12   166.71   100.36   176.51
136   119.50   150.33   116.74   155.43   114.84   159.48   107.10   168.29   101.02   178.11
137   120.44   151.61   118.21  155.815   114.84  160.925  108.615   169.49   101.02   179.27
138   121.67   152.96   119.88   157.05   114.84   162.79   110.16   169.64   102.34   179.76
139  122.925   154.45   119.88   158.33  115.605   162.79   110.16   170.98   104.01   181.21
140   124.23   156.01   119.88   159.65   116.93   163.35   110.37   172.41   104.01   183.15
141   125.61   156.01   120.55  161.075   118.35   164.63   111.78   174.36   104.50   183.15
142   127.22   156.09   121.82   163.02   120.36   165.96   113.45   174.36   106.13   184.41
143   128.23   157.28   123.15   163.02   120.36   167.39   114.33   175.25  107.235   185.94
144   128.23   158.49   124.57   163.24   120.36   169.33   114.33   176.61  107.235   187.43
145   128.23   159.73   126.52   164.47   121.06   169.33   114.99   178.11   108.17   187.65
146   128.84  160.995   126.52   165.73   122.37   169.80   116.44   179.67   110.17   189.06
147   130.06   162.31   126.52   167.03   123.77   171.07   118.33   179.67   110.17   190.71
148   131.30  163.725  126.945   168.41   125.46  172.385   118.33   180.84   110.34   191.67
149  132.575   165.68   128.20   170.00   126.26   173.79   118.33   182.22   111.85   192.25
150   133.91   165.77   129.50   171.09   126.26  175.485   119.59   183.81   113.51   193.70



















































Table 1 (cont.) 
Confidence coefficient, 100(1—e)% 
80 90 95 99 99-9 

= IF : 4 
pper Lower Upper Lower Upper Lower Upper Lower | Upper Lower | Upper 

er -_|— —— _—_— —_—=WOO—X——_—o ———__—_—— | — — —_ a] ———— J —-- ape ———$_____| 
1°22 151 135°37 | 165-77 | 130-87 | 171-09 | 126-48 | 176-23 | 121-09 | 184-975 | 113-51 | 195-67 
8-66 152 | 137-08 | 166-645 | 132-44 | 171-84 | 127-78 | 176-23 | 122-69 | 185-08 | 113-95 | 195-67 
0-64 153 | 137-08 | 167-84 | 133-66 | 173-08 | 129-14 | 177-48 | 122-69 | 186-40 | 115-52 | 196-84 
0-74 154 | 137-08 | 169-06 | 133-66 | 174-36 | 130-68 | 178-77 | 122-78 | 187-81 | 116-84 | 198-33 
2-10 155 | 137-18 | 170-305 | 133-66 | 175-69 | 132-03 | 180-14 | 124-16 | 189-50 | 116-84 | 200-04 
383 | 156 138-38 | 171-58 | 134-54 | 177-13 | 132-03 181-67 | 125-70 | 190-28 | 117-54 | 200-05 
4-55 157 | 139-605 | 172-92 | 135-81 | 178-96 | 132-03 | 183-05 | 127-07 | 190-615 | 119-21 | 201-42 
51 | 158 140-85 | 174-39 | 137-13 | 178-96 | 133-145 | 183-05 | 127-07 | 191-94 | 120-17 | 202-94 
‘713° |) 159 | 142-14 | 176-06 | 138-55 | 179-18 | 134-48 | 183-86 | 127-31 | 193-36 | 120-17 | 204-43 
18:24 | 160 | 143-49 | 176-06 | 140-54 | 180-405 | 135-92 | 185-13 | 128-70 | 195-19 | 121-12 | 204-59 
48-92 161 145-01 | 176-06 | 140-54 | 181-655 | 137-79 | 186-46 | 130-275 | 195-59 | 122-92 | 205-98 
046 | 162 | 146-45 | 177-14 | 140-54 | 182-94 | 137-79 | 187-89 | 131-50 | 196-13 | 123-46 | 207-55 
191 | 163 146-45 | 178-34 | 140-85 | 184-30 | 137-79 | 189-83 | 131-50 | 197-46 | 123-46 | 208-83 
52°31 164 146-45 | 179-56 | 142-09 | 185-80 | 138-49 | 189-83 | 131-82 | 198-88 | 124-68 | 209-13 
5379 | 165 | 146-66 | 180-80 | 143-37 | 187-30 | 139-79 | 190-21 | 133-21 | 200-84 | 126-67 | 210-52 
55°57 | 166 | 147-86 | 182-08 | 144-71 | 187-30 | 141-16 | 191-46 | 134-79 | 200-94 | 126-67 | 212-12 
55°69 | 167 | 149-08 | 183-42 | 146-16 | 187-70 | 142-70 | 192-76 | 135-99 | 201-62 | 126-75 | 213-26 
5712) 168 | 150-33 | 184-89 | 147-94 | 188-93 | 144-01 | 194-115 | 135-99 | 202-94 | 128-21 | 213-64 
58:89 | 169 | 151-61 | 186-565 | 147-94 | 190-18 | 144-01 | 195-63 | 136-30 | 204-36 | 130-14 | 215-04 
5949 | 170 | 152-96 | 186-565 | 147-94 | 191-47 | 144-01 | 197-09 | 137-68 | 206-19 | 130-14 | 216-665 
60-45 | im | 154-45 | 186-565 | 148-35 | 192-83 | 145-08 | 197-09 | 139-24 | 206-60 | 130-26 | 217-72 
6201 | 172 | 156-01 | 187-59 | 149-60 | 194-36 | 146-39 | 197-78 | 140-54 | 207-08 | 131-72 | 218-14 
63:36 { 173 156-01 | 188-78 | 150-88 | 195-73 | 147-80 | 199-04 | 140-54 | 208-40 | 133-64 | 219-54 
63:79 | 174 | 156-01 | 189-99 | 152-21 | 195-73 | 149-53 | 200-35 | 140-76 | 200-81 | 133-64 | 221-16 
65:25 | 175 | 156-09 | 191-23 | 153-67 | 196-18 | 150-19 | 201-73 | 142-12 | 211-50 | 133-75 | 222-22 
6711 | 176 | 157-28 | 192-49 | 155-43 | 197-40 | 150-19 | 203-355 | 143-64 | 212-29 | 135-20 | 222-63 
6712 | 177 158-49 | 193-81 | 155-43 | 198-655 | 150-36 | 204-36 | 145-13 | 212-53 | 137-15 | 224-02 
68°52 | 178 | 159-73 | 195-22 | 155-43 | 199-94 | 151-63 | 204-36 | 145-13 | 213-84 | 137-15 | 225-62 
7014 | 179 | 160-995 | 197-12 | 155-815 | 201-30 | 152-96 | 205-315 | 145-19 | 215-22 | 137-22 | 226-76 
7124 | 180 | 162-31 197-33 | 157-05 | 202-80 | 154-39 | 206-58 | 146-54 | 216-80 | 138-66 | 227-095 
71-81 |) 181 | 163-725 | 197-33 | 158-33 | 204-30 | 156-32 | 207-90 | 148-01 | 217-98 | 140-64 | 228-48 
173-28 | 182 | 165-68 | 197-97 | 159-65 | 204-30 | 156-32 | 209-30 | 149-76 | 217-98 | 140-74 | 230-04 
175-08 | 183 165-77 | 199-16 | 161-075 | 204-62 | 156-32 | 211-03 | 149-76 | 219-25 | 140-74 | 231-34 
175-11 | 184 | 165-77 | 200-36 | 163-02 | 205-84 | 156-87 | 211-69 | 149-76 | 220-61 | 142-10 | 231-55 
176-51 | 185 | 165-77 | 201-58 | 163-02 | 207-08 | 158-15 | 211-69 | 150-93 | 222-105 | 143-83 | 232-92 
17811 | 186 | 166-645 | 202-825 | 163-02 | 208-36 | 159-48 | 212-82 | 152-355 | 223-675 | 144-55 | 234-435 
179-27 \ 187 167-84 | 204-11 | 163-24 | 209-69 | 160-925 | 214-09 | 154-18 | 223-675 | 144-55 | 235-94 
179-76 | 188 | 169-06 | 205-45 | 164-47 | 211-13 | 162-79 | 215-40 | 154-60 | 224-65 | 145-515 | 235-99 
18121 | 189 170-305 | 206-94 | 165-73 | 212-965 | 162-79 | 216-81 | 154-60 | 225-98 | 147-135 | 237-34 
18315 | 190 171-58 | 208-52 | 167-03 | 212-965 | 162-79 | 218-56 | 155-31 | 227-41 | 148-24 | 238-82 

183-15 | 191 172-92 | 208-52 | 168-41 | 213-03 | 163-35 | 219-16 | 156-69 | 229-37 | 148-24 | 240-56 
18441] 192 174-39 © 208-52 | 170-00 | 214-24 | 164-63 | 219-16 | 158-25 | 229-37 | 148-92 | 240-56 
18594 | 193 176-06 | 209-49 | 171-09 | 215-47 | 165-96 | 220-29 | 159-53 | 230-03 | 150-46 | 241-75 
187-45 | 194 176-06 | 210-675 | 171-09 | 216-73 | 167-39 | 221-56 | 159-53 | 231-33 | 151-91 | 243-19 
187-65 | 195 176-06 | 211-88 | 171-09 | 218-03 | 169-33 | 222-865 | 159-67 | 232-71 | 151-91 | 245-18 
189-06 | 196 176-06 | 213-10 | 171-84 | 219-405 | 169-33 | 224-26 | 161-01 | 234-28 | 152-31 | 245-18 
190-71 | 197 177-14 | 214-35 | 173-08 | 221-00 | 169-33 | 225-905 | 162-46 | 235-50 | 153-79 | 246-15 
= 198 178-34 215-63 | 174-36 | 222-10 | 169-80 | 226-81 | 164-31 | 235-50 | 155-575 | 247-55 
19225 | 199 179-56 | 216-985 | 175-69 | 222-10 | 171-07 | 226-81 | 164-31 | 236-68 | 155-575 | 249-18 
193-70 | 200 180-80 | 218-49 | 177-13 | 222-60 | 172-385 227-73 | 164-31 | 238-01 | 155-69 | 250-20 

















































































Table 1 (cont.) 


Confidence coefficient, 100(1 − ε) %









































80 90 95 99 99-9 
Lower | Upper | Lower | Upper | Lower | Upper | Lower | Upper | Lower | Upper 

182-08 | 219-96 | 178-96 | 223-82 | 173-79 | 228-99 | 165-33 | 239-46 | 157-12 | 250-54 
183-42 | 219-96 | 178-96 | | 225-05° | 175-485 | 230-28 | 166-71 | 241-32 | 158-89 | 251-91 
| 184-89 | 219-96 | 178-96 | 226-33 176-23 | 231-65 | 168-29 | 241-32 | 159-49 | 253-42 
| 186-56° | 220-95 | 179-18 | 227-65 | 176-23 | 233-19 | 169-49 | 242-01 | 159-49 | 254-96 
186-565 | 222-13 | 180-40° | 229-07 176-23 | 234-53 | 169-49 | 243-315 | 160-45 | 254-96 
| 186-56° | 223-33 | 181-65° | 231-02 177-48 | 234-53 | 169-64 | 244-69 | 162-01 | 256-26 
| 186-56° | 224-55 | 182-94 | 231-02 | 178-77 | 235-145 | 170-98 | 246-24 | 163-36 | 257-70 
187-59 | 225-79 | 184-30 | 231-02 | 180-14 | 236-39 | 172-41 | 247-545 | 163-36 | 259-67 
| 188-78 | 227-07 | 185-80 | 232-14 | 181-67 | 237-67 | 174-36 247-545 | 163-79 | 259-67 
189-99 | 228-40 | 187-30 | 233-355 | 183-05 | 239-00 | 174-36 | 248-62 | 165-25 260-62 
191-23 229-85 | 187-30 | 234-60 | 183-05 | 240-45 | 174-36 | 249-94 | 167-11 | 262-00 
192-49 | 231-60 | 187-30 | 235-88 | 183-05 | 242-27 | 175-25 | 251-35 | 167-11 263-575 
193-81 | 231-60 | 187-70 | 237-21 183-86 | 242-27 | 176-61 | 253-14 | 167-12 | 264-805 
195-22 | 231-60 | 188-93 | 238-66 | 185-13 | 242-53 | 178-11 | 253-65 | 168-52 | 264-965 
197-12 | 232-36 | 190-18 | 240-43 | 186-46 | 243-76 | 179-67 | 253-92 | 170-14 | 266-31 
197-33 | 233-53 | 191-47 | 240-43 | 187-89 | 245-02 | 179-67 | 255-20 | 171-24 | 267-765 
197-33 | 234-72 | 192-83 | 240-43 | 189-83 | 246-325 | 179-67 256-54 | 171-24 | 269-61 
197-33 = 235-93 | 194-36 | 241-63 | 189-83 | 247-70 | 180-84 | 258-00 | 171-81 | 269-61 
197-97 | 237-16 | 195-73 | 242-85 | 189-83 | 249-28 | 182-22 | 259-78 | 173-285 | 270-63 
199-16 | 238-415 | 195-73 | 244-09 | 190-21 | 250-43 | 183-81 | 259-78 | 175-08 | 272-01 
200-36 | 239-715 | 195-73 | 245-37 | 191-46 | 250-43 | 184-975 | 260-47 | 175-08 | 273-60 
201-58 | 241-09 | 196-18 | 246-70 | 192-76 | 251-11 184-975 | 261-77 75-11 | 274-78 
202-825 | 242-69 | 197-405 | 248-15 | 194-115 | 252-35 | 185-08 | 263-125 | 176-51 | 274-95 
204-11 | 243-76 | 198-655 | 249-94 | 195-63 | 253-63 | 186-40 | 264-63 | 178-11 276-29 
205-45 243-76 | 199-94 | 249-94 | 197-09 | 254-95 | 187-81 | 266-15 | 179-27 | 277-73 
206-94 | 243-76 | 201-30 | 249-94 | 197-09 | 256-37 | 189-50 | 266-15 ©; 179-27 | 279-64 
208-52 | 244-885 | 202-80 | 251-09 | 197-09 | 258-34 | 190-28 267-01 179-76° | 279-64 
208-52 | 246-065 | 204-30 | 252-305 | 197-78 | 258-34 | 190-28 | 268-31 181-21 | 280-57 
208-52 | 247-26 | 204-30 | 253-54 | 199-04 | 258-45 | 190-615 | 269-68 | 183-15 | 281-94 
208-52 | 248-47 | 204-30 | 25481 | 200-35 | 259-67 | 191-94 271-22 | 183-15 283-47 
209-49 | 249-70 | 204-62 | 256-13 | 201-73 | 260-92 | 193-36 | 272-56 | 183-15 | 284-91 
| 210-675 | 250-96 | 205-84 | 257-55 | 203-355 | 262-20 | 195-19 | 272-56 | 184-41 | 284-91 
| 211-88 | 252-27 | 207-08 | 259-50 | 204-36 | 263-54 | 195-59 | 273-53 | 185-94 286-19 
213-10 | 253-67 | 208-36 | 259-60 | 204-36 | 265-00 | 195-59 | 274-83 | 187-43 | 287-60 
214-35 | 255-39 | 209-69 | 259-60 | 204-36 | 266-71 196-13 | 276-20° | 187-43 | 289-30 
215-63 | 256-07 | 211-13 | 260-51 | 205-315 | 266-71 197-46 | 277-77 | 187-65 | 290-09 
216-98° | 256-07 | 212-96° | 261-72 | 206-58 | 266-97 | 198-88 | 279-015 | 189-06 | 290-46° 
218-49 | 256-19 | 212-965 | 262-95 | 207-90 | 268-19 | 200-84 | 279-015 | 190-71 | 291-81 
219-96 | 257-36 | 212-965 | 264-20 | 209-30 | 269-44 | 200-94 | 280-02 | 191-67 | 293-26 
219-96 | 258-54 | 213-03 | 265-50 | 211-03 | 270-73 | 200-94 281-32 | 191-67 | 295:12 
219-96 | 259-73 | 214-24 | 266-87 | 211-69 | 272-08 | 201-62 | 282-70 | 192-25 | 295-12 
219-96 | 260-94 | 215-47 | 268-42 | 211-69 | 273-57 | 202-94 | 284-25 | 193-70 | 296-05 
220-95 | 262-17 | 216-73 | 269-67 | 211-69 | 275-15 | 204-36 | 285-53 | 195-67 | 297-41 
| 222-13 | 263-425 | 218-03 | 269-67 | 212-82 | 275-15 | 206-19 285-53 | 195-67 | 298-91 
| 223-33 | 264-73 | 219-405 | 269-90 | 214-09 | 275-46 | 206-60 | 286-50 | 195-67 | 300-46 
224-55 | 266-11 221-00 | 271-10 | 215-40 | 276-69 | 206-60 | 287-79 | 196-84 | 300-46 
225-79 | 267-73 | 222-10 | 272-315 | 216-81 | 277-94 | 207-08 | 289-16 | 198-33 301-62 
227-07 | 268-73 | 222-10 | 273-55 | 218-56 | 279-22 | 208-40 | 290-68 | 200-04 | 303-00 
228-40 | 268-73 | 222-10 | 274-82 | 219-16 | 280-57 | 209-81 | 292-10 | 200-04 | 30457 
229-85 | 268-73 | 222-60 | 276-14° | 219-16 | 282-05 | 211-50 | 292-10 | 200-05 | 305°81 





If c is the observed frequency or count and $m_L$, $m_U$ are the lower and upper 100(1 − ε) %
confidence limits for its expectation, m, then $\Pr(m_L \le m \le m_U) \ge 1 - \varepsilon$.














Table 1 (cont.) 


Confidence coefficient, 100(1 − ε) %





















































80 90 95 99 99-9 

Lower | Upper | Lower | Upper | Lower | Upper | Lower | Upper | Lower | Upper 
231-60 | 269-79 | 223-82 | 277-57 | 219-16 | 283-67 | 212-29 | 292-95 | 201-42 | 305-86 
231-60 | 270-96 | 225-055 | 279-53 | 220-29 | 283-67 | 212-29 | 294-24 | 202-94 | 307-18 
231-60 | 272-145 | 226-33 | 279-53 | 221-56 | 283-93 | 212-53 | 295-59 | 204-43 | 308-58 
231-60 | 273-345 | 227-65 | 279-53 | 222-865 | 285-15 | 213-84 | 297-07 | 204-43 | 310-23 

| 232-36 | 274-56 | 229-07 | 280-45 | 224-26 | 286-40 | 215-22 | 298-71 | 204-59 | 311-16 

| 233-53 | 275-81 | 231-02 | 281-65 | 225-905 | 287-68 | 216-80 | 298-71 | 205-98 | 311-395 
234-72 | 277-09 | 231-02 | 282-87 | 226-81 | 289-01 | 217-98 | 299-39 | 207-55 | 312-72 
235-93 | 278-43 | 231-02 | 284-12 | 226-81 | 290-46 | 217-98 | 300-67 | 208-83 | 314-13 | 
237-16 | 279-90 | 231-02 | 285-40 | 226-81 | 292-26 | 217-98 | 302-00 | 208-83 | 315-89 | 
238-415 | 281-56 | 232-14 | 286-735 | 227-73 | 292-26 | 219-25 | 303-43 | 209-13 | 316-50 | 
239-715 | 281-56 | 233-355 | 288-20 | 228-99 | 292-37 | 220-61 | 305-35 | 210-52 | 316-92 
241-09 | 281-56 | 234-60 | 289-90 | 230-28 | 293-59 | 222-105 | 305-35 | 212-12 | 318-25 
242-69 | 282-16 | 235-88 | 289-90 | 231-65 | 294-825 | 223-675 | 305-81 | 213-26 | 319-67 
243-76 | 283-33 | 237-21 | 289-90 | 233-19 | 296-09 | 223-675 | 307-07 | 213-26 | 321-56 | 
243-76 | 284-51 | 238-66 | 290-96 | 234-53 | 297-41 | 223-675 | 308-38 | 213-64 | 321-83 
243-76 | 285-695 | 240-43 | 292-16 | 234-53 | 298-81 | 224-65 | 309-775 | 215-04 | 322-43 
243-76 | 286-90 | 240-43 | 293-385 | 234-53 | 300-56 | 225-98 | 311-41 | 216-665 | 323-76 
244-885 | 288-12 | 240-43 | 294-63 | 235-145 | 301-16 | 227-41 | 312-38 | 217-72 | 325-19 
246-065 | 289-375 | 240-43 | 295-91 | 236-39 | 301-16 | 229-37 | 312-38 | 217-72 | 327-19 
247-26 | 290-67 | 241-63 | 297-26 | 237-67 302-00 | 229-37 | 313-46 | 218-14 | 327-19 
248-47 | 292-03 | 242-85 | 298-735 | 239-00 | 303-22 | 229-37 | 314-755 | 219-54 | 327-92 
249-70 | 293-57 | 244-09 | 300-36 | 240-45 | 304-48 | 230-03 | 316-11 | 221-16 | 329-25 
250-96 | 294-88 | 245-37 | 300-36 | 242-27 | 305-77 | 231-33 | 317-60 | 222-22 | 330-68 
252-27 | 294-88 | 246-70 | 300-36 | 242-27 | 307-13 | 232-71 | 319-19 | 222-22 | 332-60 
253-67 | 294-88 | 248-15 | 301-44 | 242-27 | 308-645 | 234-28 | 319-19 | 222-63 | 332-79 

| 255-39 | 295-66 | 249-94 | 302-64 | 242-53 310-07 | 235-50 | 319-84 | 224-02 | 333-40 
256-07 | 296-82 | 249-94 | 303-86 | 243-76 310-07 | 235-50 | 321-11 | 225-62 | 334-73 
256-07 | 298-00 | 249-94 | 305-10 | 245-02 | 310-38 | 235-50 | 322-48 | 226-76 | 336-14 
256-07 | 299-185 | 249-94 | 306-38 | 246-325 | 311-60 | 236-68 | 323-84 | 226-76 | 337-92 

| 256-19 | 300-39 | 251-09 | 307-71 | 247-70 | 312-835 | 238-01 | 325-58 | 227-095 | 338-47 
257-36 | 301-61 | 252-305 | 309-16 | 249-28 | 314-10 | 239-46 | 326-21 | 228-48 | 338-87 
258-54 | 302-86 | 253-54 | 310-94 | 250-43 | 315-42 | 241-32 | 326-21 | 230-04 | 340-19 

| 259-73 | 304-15 | 254-81 | 310-94 | 250-43 | 316-83 | 241-32 | 327-46 | 231-34 | 341-59 

| 260-94 | 305-51 | 256-13 | 310-94 | 250-43 | 318-63 | 241-32 | 328-75 | 231-34 | 342-26 
262-17 | 307-03 | 257-55 | 311-88 | 251-11 | 319-09 | 242-01 | 330-10 | 231-55 | 344-12 
263-425 | 308-42 | 259-50 | 313-07 | 252-35 | 319-09 | 243-315 | 331-59 | 232-92 | 344-33 
264-73 308-42 | 259-60 | 314-29 | 253-63 | 319-95 | 244-69 | 333-20 | 234-43° | 345-64 
266-11 | 308-42 | 259-60 | 315-52 | 254-95 | 321-17 | 246-24 | 333-20 | 235-94 | 347-02 
267-73 | 309-10 | 259-60 | 316-785 | 256-37 | 322-42 | 247-545 | 333-80 | 235-94 | 348-605 
268-73 | 310-26 | 260-51 | 318-10 | 258-34 | 323-70 | 247-545 | 335-065 | 235-99 | 349-78 

268-73 «| 311-43 | 261-72 | 319-50 | 258-34 | 325-04 | 247-545 | 336-37 | 237-34 | 349-78 

| 268-73 | 312-62 | 262-95 321-24 | 258-34 | 326-50 | 248-62 | 337-76 | 238-82 | 351-07 
268-73 | 313-815 | 264-20 | 321-86 | 258-45 | 328-21 | 249-94 | 339-38 | 240-56 | 352-43 

| 269-79 315-03 | 265-50 | 321-86 | 259-67 | 328-21 | 251-35 | 340-41 | 240-56 | 353-95 

| 270-96 | 316-27 | 266-87 | 322-29 | 260-92 | 328-28 | 253-14 | 340-41 | 240-56 | 355-43 

| 272-145 | 317-54 | 268-42 | 323-475 | 262-20 | 329-49 | 253-65 | 341-38 | 241-75 | 355-43 
273-345 | 318-87 | 269-67 | 324-68 | 263-54 | 330-72 | 253-65 | 342-65 | 243-19 | 356-49 
274-56 | 320-315 | 269-67 | 325-90 | 265-00 | 331-97 | 253-92 | 343-98 | 245-18 | 357-835 | 
275-81 | 322-14 | 269-67 | 327-15 | 266-71 | 333-26 | 255-20 | 345-41 | 245-18 | 359-29 | 

| 277-09 | 322-15 | 269-90 | 328-435 | 266-71 | 334-62 | 256-54 | 347-375 | 245-18 | 361-085 
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THE DISTRIBUTION OF THE NUMBER OF SUCCESSES 
IN A SEQUENCE OF DEPENDENT TRIALS 


By K. R. GABRIEL 


Hebrew University, Jerusalem and University of North Carolina 


1. INTRODUCTION 


Consider a sequence of n trials and an initial trial whose outcomes are the random variables
$X_1, X_2, \ldots, X_n$ and $X_0$ respectively, where

$$X_k = \begin{cases} 1 & \text{if the $k$th trial is successful,} \\ 0 & \text{otherwise,} \end{cases}$$

for $k = 0, 1, 2, \ldots, n$. Let the probability of a success at the initial trial be

$$R = \Pr\{X_0 = 1\},$$

and assume the conditional probabilities, which are denoted by

$$p_y = \Pr\{X_k = 1 \mid X_{k-1} = y\} \quad (y = 0, 1),$$

for $k = 1, 2, \ldots, n$, to be independent of k and of the outcomes of the trials preceding the
(k − 1)st.

This sequence is an irreducible Markov Chain with two ergodic states. Its stationary
probability distribution has as probability of a success

$$P = \frac{p_0}{1 - (p_1 - p_0)}.$$

These results follow from the general theory of Markov Chains (Feller, 1957, chapters xv 
and xvi.2, example (a)). 
The recurrence time distribution of successes has probabilities

$$p_1,\quad (1-p_1)\,p_0,\quad (1-p_1)(1-p_0)\,p_0,\quad (1-p_1)(1-p_0)^2\,p_0,\quad \ldots,$$

for one, two, three, four, ... trials up to and including the next success. Hence the mean and
variance of recurrence times are found to be

$$\mu = \frac{1-(p_1-p_0)}{p_0}, \qquad \sigma^2 = \frac{1-p_1}{p_0^2}\,(1+p_1-p_0).$$





Now consider the number of successes in the sequence, i.e.

$$S = \sum_{k=1}^{n} X_k.$$

It is known that S is asymptotically normally distributed with mean $n/\mu$ and variance
$n\sigma^2/\mu^3$ (Feller, 1957, chapter XIII.6; Smith, 1958). Substituting for $\mu$ and $\sigma^2$, one obtains

$$E(S) \sim nP, \qquad \operatorname{Var}(S) \sim nP(1-P)\,\frac{1+d}{1-d},$$

where $d = p_1 - p_0$. This is an asymptotic result which indicates neither the exact distribution
for small n nor the rapidity of approach to normality. It seems to have been given first by
Markov (1924, § 60).


2. THE EXACT DISTRIBUTION 


If S successes occur in n trials there will be a number of changes from success on one trial
(including the initial trial) to failure on the next, and vice versa. Denote the number of
changes by C and let a and b stand for the least integer not smaller than $\tfrac12(C-1)$ or $\tfrac12 C$
respectively.

Consider first the case of a success at the initial trial; then S successes with C changes will
involve b changes from success to failure, a changes from failure to success, as well as S − a
successes following successes and n − S − b failures following failures. The probability of
any one arrangement of S successes and n − S failures with C changes is therefore

$$(1-p_1)^b\, p_0^a\, p_1^{S-a}\, (1-p_0)^{n-S-b} = p_1^S (1-p_0)^{n-S} \left(\frac{p_0}{p_1}\right)^{a} \left(\frac{1-p_1}{1-p_0}\right)^{b}.$$

Next, any arrangement of n trials with S successes and C changes will involve a changes to
successes which may occur before any a of the S successes, i.e. in any of $\binom{S}{a}$ different
possible positions. Also b changes occur before failures, of which the first must occur before
the first failure and the rest may be arranged in any of $\binom{n-S-1}{b-1}$ different ways. Given
the arrangements of both kinds of changes, the position of all changes in the sequence is
uniquely determined, and therefore the total number of possible arrangements of C changes
among n trials with S successes is $\binom{S}{a}\binom{n-S-1}{b-1}$. Hence the probability of S successes
with C changes in n trials following a successful initial trial is

$$\Pr\{S, C \mid n, X_0 = 1\} = \binom{S}{a}\binom{n-S-1}{b-1}\, p_1^S (1-p_0)^{n-S} \left(\frac{p_0}{p_1}\right)^{a} \left(\frac{1-p_1}{1-p_0}\right)^{b}.$$

The number of changes C may be any number from 1 to $n + \tfrac12 - |2S + \tfrac12 - n| = C_1$, say,
except if S = n when $C_1 = 0$. Thus the probability of S successes in n trials following a
successful initial trial is obtained by summing the above probabilities over all possible
values of C,

$$\Pr\{S \mid n, X_0 = 1\} = p_1^S (1-p_0)^{n-S} \sum_{C=1}^{C_1} \binom{S}{a}\binom{n-S-1}{b-1} \left(\frac{p_0}{p_1}\right)^{a} \left(\frac{1-p_1}{1-p_0}\right)^{b}.$$
In a similar manner for the case of a failure on the initial trial

$$\Pr\{S \mid n, X_0 = 0\} = p_1^S (1-p_0)^{n-S} \sum_{C=1}^{C_0} \binom{S-1}{b-1}\binom{n-S}{a} \left(\frac{p_0}{p_1}\right)^{b} \left(\frac{1-p_1}{1-p_0}\right)^{a},$$

where $C_0 = n + \tfrac12 - |2S - \tfrac12 - n|$, except for S = 0 when $C_0 = 0$ and the summation above is
understood to include only the term C = 0.

As one would expect, when d = 0, i.e. in the case of independent trials, either probability 
becomes that of the binomial distribution. 

Finally, given the probability R of a success at the initial trial, the unconditional
probability of S successes in the sequence of n trials becomes

$$\Pr\{S \mid n\} = R\,\Pr\{S \mid n, X_0 = 1\} + (1-R)\,\Pr\{S \mid n, X_0 = 0\}.$$





This form of the distribution is rather unwieldy but does not seem to lend itself to sim- 
plification. It may be easier to obtain an impression of the closeness to normality by means 
of the moments. 
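
The sums over the number of changes C can be evaluated mechanically. The following Python sketch is an illustration only, not part of the original paper: the function names are hypothetical, the limits $C_1$ and $C_0$ are written in an integer form equivalent to the expressions with halves, and a forward recursion over the chain is included purely as an independent check on the combinatorial formula.

```python
from math import comb


def pr_after_success(S, n, p1, p0):
    """Pr{S | n, X_0 = 1}: sum over the number of changes C = 1, ..., C_1."""
    if S == n:  # C_1 = 0: no change occurs, every trial follows a success
        return p1 ** n
    c1 = (2 * n + 1 - abs(4 * S + 1 - 2 * n)) // 2  # n + 1/2 - |2S + 1/2 - n|
    total = 0.0
    for C in range(1, c1 + 1):
        a, b = C // 2, (C + 1) // 2  # least integers >= (C - 1)/2 and >= C/2
        ways = comb(S, a) * comb(n - S - 1, b - 1)
        total += ways * (1 - p1) ** b * p0 ** a * p1 ** (S - a) * (1 - p0) ** (n - S - b)
    return total


def pr_after_failure(S, n, p1, p0):
    """Pr{S | n, X_0 = 0}: sum over the number of changes C = 1, ..., C_0."""
    if S == 0:  # C_0 = 0: only the term with no change
        return (1 - p0) ** n
    c0 = (2 * n + 1 - abs(4 * S - 1 - 2 * n)) // 2  # n + 1/2 - |2S - 1/2 - n|
    total = 0.0
    for C in range(1, c0 + 1):
        a, b = C // 2, (C + 1) // 2  # here b counts changes to success, a changes to failure
        ways = comb(S - 1, b - 1) * comb(n - S, a)
        total += ways * p0 ** b * (1 - p1) ** a * p1 ** (S - b) * (1 - p0) ** (n - S - a)
    return total


def exact_distribution(n, p1, p0, R):
    """Unconditional Pr{S | n} for S = 0, ..., n."""
    return [R * pr_after_success(S, n, p1, p0) + (1 - R) * pr_after_failure(S, n, p1, p0)
            for S in range(n + 1)]


def forward_recursion(n, p1, p0, R):
    """Independent check: propagate Pr{successes so far, last outcome} trial by trial."""
    state = {(0, 1): R, (0, 0): 1 - R}  # the initial trial is not counted in S
    for _ in range(n):
        nxt = {}
        for (s, x), pr in state.items():
            p = p1 if x == 1 else p0
            nxt[(s + 1, 1)] = nxt.get((s + 1, 1), 0.0) + pr * p
            nxt[(s, 0)] = nxt.get((s, 0), 0.0) + pr * (1 - p)
        state = nxt
    return [state.get((s, 1), 0.0) + state.get((s, 0), 0.0) for s in range(n + 1)]


if __name__ == "__main__":
    p1, p0 = 0.674, 0.293          # January estimates quoted in Section 4
    R = p0 / (1 - (p1 - p0))       # stationary start, R = P
    for S, pr in enumerate(exact_distribution(31, p1, p0, R)):
        print(S, round(pr, 4))
```

Writing each term as a product of the four transition probabilities, rather than factoring out $p_1^S(1-p_0)^{n-S}$, avoids a division by zero when $p_1 = 0$ or $p_0 = 1$.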


3. MOMENTS OF THE DISTRIBUTION 


For a sequence of n dependent trials the rth factorial moment of the number of successes S is

$$r!\,\phi_r = r! \sum_{i_1 < i_2 < \cdots < i_r} \Pr\{X_{i_1} = 1, X_{i_2} = 1, \ldots, X_{i_r} = 1\},$$

where $i_1, i_2, \ldots, i_r$ is any subset of r of the integers 1, 2, ..., n (Gumbel, 1938, table 1 and
formula 17). Putting $k_j = i_j - i_{j-1}$ (j = 2, ..., r) and $k_1 = i_1$, this becomes

$$r!\,\phi_r = r! \sum\nolimits^{(n)} \Pr\{X_{k_1} = 1, X_{k_1+k_2} = 1, \ldots, X_{k_1+k_2+\cdots+k_r} = 1\},$$

where $\sum^{(n)}$ denotes summation over all sets of r positive integer suffixes $(k_1, \ldots, k_r)$ whose
sum is ≤ n. For a Markov Chain this becomes

$$r! \sum\nolimits^{(n)} \Pr\{X_{k_1} = 1\}\, \Pr\{X_{k_1+k_2} = 1 \mid X_{k_1} = 1\} \cdots \Pr\{X_{k_1+\cdots+k_r} = 1 \mid X_{k_1+\cdots+k_{r-1}} = 1\}.$$

Now

$$\Pr\{X_k = 1 \mid X_{k-i} = y\} = P(1-d^i) + y\,d^i$$

for k = 1, 2, ..., n and i = 1, 2, ..., k (Feller, 1957, p. 385). In particular,

$$\Pr\{X_{k_1+\cdots+k_j} = 1 \mid X_{k_1+\cdots+k_{j-1}} = 1\} = P + (1-P)\,d^{k_j}$$

and

$$\Pr\{X_{k_1} = 1\} = P + (R-P)\,d^{k_1}.$$


Introducing these probabilities into the expression for the factorial moments, and writing
f = 1/(1 − d), one obtains

$$\phi_1 = nP + (R-P)\,df\,(1-d^n),$$

$$\phi_2 = \frac{n^2-n}{2}\,P^2 + P(1+R-2P)\,df^2\,[n(1-d) - (1-d^n)] + (1-P)(R-P)\,df^2\,[d(1-d^n) - nd^n(1-d)],$$

$$\begin{aligned}
\phi_3 ={}& \frac{n^3-3n^2+2n}{6}\,P^3 + P^2(2+R-3P)\,\tfrac12 df^3\,[n^2(1-d)^2 - n(1-d)(3-d) + 2(1-d^n)] \\
&+ P(1-P)(1+2R-3P)\,df^3\,[nd(1-d) - 2d(1-d^n) + nd^n(1-d)] \\
&+ (R-P)(1-P)^2\,df^3\,[d^2(1-d^n) - \tfrac12 n^2 d^n(1-d)^2 + \tfrac12 nd^n(1-d)(1-3d)],
\end{aligned}$$

$$\begin{aligned}
\phi_4 ={}& \frac{n^4-6n^3+11n^2-6n}{24}\,P^4 + P^3(3+R-4P)\,\tfrac16 df^4\,[n^3(1-d)^3 - 3n^2(1-d)^2(2-d) \\
&\qquad + n(1-d)(11-7d+2d^2) - 6(1-d^n)] \\
&+ 3P^2(1-P)(1+R-2P)\,\tfrac12 df^4\,[n^2 d(1-d)^2 - nd(1-d)(5-d) - 2nd^n(1-d) + 6d(1-d^n)] \\
&+ P(1-P)^2(1+3R-4P)\,df^4\,[nd^2(1-d) - 3d^2(1-d^n) + \tfrac12 n^2 d^n(1-d)^2 - \tfrac12 nd^n(1-d)(1-5d)] \\
&+ (R-P)(1-P)^3\,df^4\,[d^3(1-d^n) - \tfrac16 nd^n(1-d)(2-7d+11d^2) \\
&\qquad + \tfrac12 n^2 d^n(1-d)^2(1-2d) - \tfrac16 n^3 d^n(1-d)^3].
\end{aligned}$$



These expressions are unwieldy and their substitution into the formulae for the moments
about the mean (Kendall, 1948, chapter 3) does not yield simpler forms. In numerical
work it would appear advisable to compute the $\phi$ values first and then obtain the other
moments from them.

Again, for d = 0 these moments become those of the binomial distribution. 

In the special case R = P, whatever d, i.e. in a stationary Markov Chain, the first three
moments are:

$$\mu_1 = nP,$$

$$\mu_2 = P(1-P)\,[n(1+d)f - 2df^2(1-d^n)],$$

$$\mu_3 = P(1-P)(1-2P)\,[n\{1 + 6df^2(1+d^n)\} - 6d(1+d)f^3(1-d^n)].$$
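
As a numerical illustration of the stationary-case moments, the sketch below evaluates the closed forms for the mean and the second and third moments about the mean and compares them with moments obtained by enumerating the chain directly. It is an illustrative check rather than the author's procedure; the function names are hypothetical, and the chain is parameterised by P and d through $p_0 = P(1-d)$, $p_1 = p_0 + d$.

```python
def stationary_moments(n, P, d):
    """Closed forms for the mean and the 2nd and 3rd central moments when R = P."""
    f = 1.0 / (1.0 - d)
    dn = d ** n
    mu1 = n * P
    mu2 = P * (1 - P) * (n * (1 + d) * f - 2 * d * f ** 2 * (1 - dn))
    mu3 = P * (1 - P) * (1 - 2 * P) * (
        n * (1 + 6 * d * f ** 2 * (1 + dn)) - 6 * d * (1 + d) * f ** 3 * (1 - dn)
    )
    return mu1, mu2, mu3


def enumerated_moments(n, P, d):
    """Same moments from the distribution of S built by forward recursion."""
    p0 = P * (1 - d)
    p1 = p0 + d
    state = {(0, 1): P, (0, 0): 1 - P}  # stationary initial trial, not counted in S
    for _ in range(n):
        nxt = {}
        for (s, x), pr in state.items():
            p = p1 if x == 1 else p0
            nxt[(s + 1, 1)] = nxt.get((s + 1, 1), 0.0) + pr * p
            nxt[(s, 0)] = nxt.get((s, 0), 0.0) + pr * (1 - p)
        state = nxt
    probs = [state.get((s, 1), 0.0) + state.get((s, 0), 0.0) for s in range(n + 1)]
    mean = sum(s * pr for s, pr in enumerate(probs))
    mu2 = sum((s - mean) ** 2 * pr for s, pr in enumerate(probs))
    mu3 = sum((s - mean) ** 3 * pr for s, pr in enumerate(probs))
    return mean, mu2, mu3


if __name__ == "__main__":
    mu1, mu2, mu3 = stationary_moments(7, 0.6, 0.2)
    print(mu1, mu2, mu3 / mu2 ** 1.5)  # compare with the exact entries of Table 1
```

For n = 7, P = 0.6 and d = 0.2 the variance and standardized third cumulant obtained this way can be compared with the exact entries 2.37 and −0.228 of Table 1.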
For large n, approximations may be obtained by ignoring terms in $d^n$ and terms not
involving n.* Then

$$\mu_1 = nP, \qquad \mu_2 \approx nP(1-P)\,\frac{1+d}{1-d},$$

$$\gamma_1 \approx \frac{1}{\sqrt{n}}\,\frac{1-2P}{\sqrt{[P(1-P)]}}\,\frac{1+4d+d^2}{(1+d)\sqrt{(1-d^2)}},$$

$$\gamma_2 \approx \frac{1}{n}\,\frac{1-6P(1-P)}{P(1-P)}\,\frac{1+10d+d^2}{1-d^2},$$

where terms involving (R − P) and $n^{-1}$ or powers of $n^{-1}$ are ignored, and $\gamma_1$, $\gamma_2$ are the usual
standardized 3rd and 4th cumulants.

At this level of approximation the cumulants are independent of R, i.e. for a sufficiently
long sequence the outcome of the initial trial is immaterial. $\mu_2$, $\gamma_1$ and $\gamma_2$ can be factorized
into the corresponding term for the binomial distribution and a term depending on d.
For large n, $\gamma_1$ and $\gamma_2$ will vanish whilst $\mu_1$ and $\mu_2$ correspond to the mean and variance in
the asymptotic normal distribution (§ 1). For any n, P and d, the approximations to $\gamma_1$
and $\gamma_2$ may give some indication of the deviation from normality.
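
The large-sample expressions are simple enough to tabulate directly. The sketch below (hypothetical function names, offered as an illustration only) evaluates the approximate mean, variance and standardized cumulants, and also recomputes the variance from the recurrence-time mean and variance of Section 1 as $n\sigma^2/\mu^3$, which should agree with $nP(1-P)(1+d)/(1-d)$.

```python
from math import sqrt


def large_sample_cumulants(n, P, d):
    """Approximate mean, variance, gamma_1 and gamma_2 of S for large n."""
    mean = n * P
    var = n * P * (1 - P) * (1 + d) / (1 - d)
    g1 = (1 - 2 * P) / sqrt(n * P * (1 - P)) * (1 + 4 * d + d * d) / ((1 + d) * sqrt(1 - d * d))
    g2 = (1 - 6 * P * (1 - P)) / (n * P * (1 - P)) * (1 + 10 * d + d * d) / (1 - d * d)
    return mean, var, g1, g2


def renewal_variance(n, p1, p0):
    """Asymptotic variance n * sigma^2 / mu^3 from the recurrence-time moments."""
    mu = (1 - (p1 - p0)) / p0
    sigma2 = (1 - p1) * (1 + p1 - p0) / p0 ** 2
    return n * sigma2 / mu ** 3


if __name__ == "__main__":
    n, P, d = 7, 0.8, 0.2
    p0 = P * (1 - d)  # transition probabilities implied by P and d
    p1 = p0 + d
    print(large_sample_cumulants(n, P, d))
    print(renewal_variance(n, p1, p0))
```

With n = 7, P = 0.8 and d = 0.2 the values can be set against the "Approx." rows of Table 1.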


* [By introducing the moments of the recurrence time distribution, these large sample
approximations to the moments of S can presumably be related to the results of W. L. Smith, whose paper
(Biometrika, 1959, 46, 1-29) ‘On the cumulants of renewal processes’ was published after the present
paper had been received for publication. It should be noted, however, that while Smith’s results were
established for the continuous case, the present author deals with a discrete chain. Ed.]


29 Biom, 46 
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4. NUMERICAL CALCULATIONS AND AN APPLICATION 


An idea of the form of the distribution of S is given by the cumulants as computed for
Tables 1 and 2. The exact computations are cumbersome and have therefore been restricted
to a few selected values of the parameters. It will be noted that for values of P = 0.2 and
0.4, $\mu_2$ and $\gamma_2$ will be as for P = 0.8 and 0.6, respectively, $\gamma_1$ will change sign, and the mean
will be n minus the mean for P = 0.8 and 0.6.


Table 1. Cumulants of the distribution of S in sequences of seven trials 



























































P           |  0.6   |  0.6   |  0.6   |  0.6   |  0.8   |  0.8   |  0.8   |  0.8
d           |  0     |  0.2   |  0.4   |  0.8   |  0     |  0.2   |  0.4   |  0.8
μ1          |  4.2   |  4.2   |  4.2   |  4.2   |  5.6   |  5.6   |  5.6   |  5.6
μ2 Exact    |  1.68  |  2.37  |  3.39  |  7.53  |  1.12  |  1.58  |  2.26  |  5.02
μ2 Approx.  |  1.68  |  2.52  |  3.92  | 15.12  |  1.12  |  1.68  |  2.61  | 10.08
γ1 Exact    | −0.154 | −0.228 | −0.294 | −0.394 | −0.567 | −0.837 | −1.081 | −1.447
γ1 Approx.  | −0.154 | −0.241 | −0.332 | −0.691 | −0.567 | −0.887 | −1.219 | −2.540
γ2 Exact    | −0.261 | −0.442 | −0.690 | −1.425 |  0.036 |  0.380 |  0.686 |  0.709
γ2 Approx.  | −0.261 | −0.828 | −1.606 | −7.001 |  0.036 |  0.113 |  0.219 |  0.956

Table 2. Cumulants of the distribution of S in sequences of thirty trials

P           |  0.6   |  0.6   |  0.6   |  0.6   |  0.8   |  0.8   |  0.8   |  0.8
d           |  0     |  0.2   |  0.4   |  0.8   |  0     |  0.2   |  0.4   |  0.8
μ1          | 18     | 18     | 18     | 18     | 24     | 24     | 24     | 24
μ2 Exact    |  7.20  | 10.65  | 16.27  | 55.45  |  4.80  |  7.10  | 10.84  | 36.81
μ2 Approx.  |  7.20  | 10.80  | 16.80  | 64.80  |  4.80  |  7.20  | 11.19  | 43.20
γ1 Exact    | −0.075 | −0.115 | −0.157 | −0.297 | −0.274 | −0.423 | −0.576 | −1.099
γ1 Approx.  | −0.075 | −0.117 | −0.160 | −0.334 | −0.274 | −0.429 | −0.589 | −1.227
γ2 Exact    | −0.061 | −0.109 | −0.181 | −0.684 |  0.008 |  0.105 |  0.227 |  0.771
γ2 Approx.  | −0.061 | −0.193 | −0.375 | −1.633 |  0.008 |  0.026 |  0.051 |  0.223





























For d = 0 the distribution is binomial, as was noted. With increasing d, the variance,
skewness and kurtosis increase considerably. Changes in n and P affect the moments much
as they do those of the binomial. This regularity holds only if $d^n$ is negligible, e.g. for d = 0.8
and n = 7 the pattern is disrupted.

The closeness of the approximations seems to depend largely on d being small and n
large. The approximations are a good deal better for $\mu_2$ and $\gamma_1$ than for $\gamma_2$.

The distribution of S is illustrated by an application to rainfall data of Tel Aviv, where
it has been suggested that a Markov Chain probability model holds* (rainy and dry days
being considered as ‘successes’ and ‘failures’, respectively). The estimated probabilities
for January are $p_1 = 0.674$ and $p_0 = 0.293$.
The exact cumulants are therefore

$$\mu_1 = 14.663, \qquad \mu_2 = 16.744, \qquad \gamma_1 = +0.040, \qquad \gamma_2 = -0.195,$$

while the large sample approximate value of the variance is

$$\mu_2 = nP(1-P)(1+d)/(1-d) = 17.240.$$

* The fit of a Markov Chain probability scheme to daily rainfall data for Tel Aviv will be discussed
in a forthcoming publication by K. R. Gabriel and J. Neumann.



Table 3 shows that for these parameter values the normal approximation using the 
large sample variance is very close to the discrete exact distribution. Observed data are 
available for only 27 years and are therefore not presented in the table. 
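
The text does not spell out how the normal probabilities were attached to the integer values of S. A minimal sketch, assuming a continuity-corrected normal distribution with mean nP and the large-sample variance, with the two tails lumped into S = 0 and S = n, gives values that can be compared with the "Approx." column of Table 3. The function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def normal_approximation(n, P, d):
    """Approximate Pr{S = s}, s = 0..n, using mean nP and variance nP(1-P)(1+d)/(1-d).

    Assumption (not stated in the paper): each integer s receives the normal
    probability of (s - 1/2, s + 1/2), the tails going to s = 0 and s = n.
    """
    mean = n * P
    sd = sqrt(n * P * (1 - P) * (1 + d) / (1 - d))
    probs = []
    for s in range(n + 1):
        lower = 0.0 if s == 0 else normal_cdf((s - 0.5 - mean) / sd)
        upper = 1.0 if s == n else normal_cdf((s + 0.5 - mean) / sd)
        probs.append(upper - lower)
    return probs


if __name__ == "__main__":
    for s, pr in enumerate(normal_approximation(31, 0.473, 0.381)):
        print(s, round(pr, 4))
```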


Table 3. Exact distribution and asymptotic normal distribution, the comparison being made 
for the values n = 31, P = 0-473, d = 0-381 (suggested by data of January rainfall at 
Tel Aviv) 


























No. of successes | Exact  | Approx. | No. of successes | Exact  | Approx. | No. of successes | Exact  | Approx.
 0               | 0.0000 | 0.0003  |                  |        |         |                  |        |
 1               | 0.0001 | 0.0004  | 11               | 0.0671 | 0.0651  | 21               | 0.0306 | 0.0301
 2               | 0.0004 | 0.0009  | 12               | 0.0794 | 0.0782  | 22               | 0.0206 | 0.0202
 3               | 0.0010 | 0.0019  | 13               | 0.0890 | 0.0886  | 23               | 0.0129 | 0.0129
 4               | 0.0027 | 0.0036  | 14               | 0.0942 | 0.0947  | 24               | 0.0075 | 0.0078
 5               | 0.0057 | 0.0065  | 15               | 0.0946 | 0.0956  | 25               | 0.0040 | 0.0044
 6               | 0.0105 | 0.0110  | 16               | 0.0899 | 0.0907  | 26               | 0.0020 | 0.0023
 7               | 0.0177 | 0.0176  | 17               | 0.0810 | 0.0819  | 27               | 0.0009 | 0.0012
 8               | 0.0275 | 0.0266  | 18               | 0.0692 | 0.0696  | 28               | 0.0003 | 0.0006
 9               | 0.0396 | 0.0380  | 19               | 0.0558 | 0.0557  | 29               | 0.0001 | 0.0002
10               | 0.0531 | 0.0510  | 20               | 0.0426 | 0.0422  | 30               | 0.0000 | 0.0001
                 |        |         |                  |        |         | 31               | 0.0000 | 0.0001























Note on computation 
Computation of the exact probability distribution is rather tedious. It involves evaluation
of the n(n + 1) + 2 terms inside the summation signs of the expressions for

$$\Pr\{S \mid n, X_0 = 1\} \quad \text{and} \quad \Pr\{S \mid n, X_0 = 0\}.$$


This is best done with the help of tables of logarithms of binomial coefficients (Hald, 1952, 
table XIV). I am happy to acknowledge Mrs R. Falk’s precise work in computing the 
distribution of Table 3. 


I wish to thank Profs. W. J. Hall and W. Hoeffding for their helpful suggestions. 
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Editorial note added in proof 


The reader may care to note that a paper by P. V. Krishna Iyer and N. S. Shakuntala 
on ‘Cumulants of some distributions arising from a two-state Markoff chain’, which has 
appeared since the present paper was sent to press (1959, Proc. Camb. Phil. Soc. 55, 273-6) 
covers some ground in parallel with that covered by Mr Gabriel. E. S. P.
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TABLE OF THE UPPER 10% POINTS OF THE 
STUDENTIZED RANGE 


By J. PACHARES 
Hughes Aircraft Company, Culver City, California 


1. INTRODUCTION 

Let q = w/s be the familiar ‘studentized’ range when sampling from a normal population,
where w is a sample range of n values and s is a sample estimate of standard deviation based
on ν degrees of freedom and distributed independently of w. If F(q) = Pr(w/s < q), then
the values of q for which F(q) = 1 − α have been computed to three significant figures for
α = 0.10, n = 2(1)20 and ν = 1(1)20, 24, 30, 40, 60, 120, ∞. The values of q when α = 0.05
were also recomputed for checking purposes with the result that several minor disagreements
with Pearson & Hartley’s table 29 (1954) were noted.* The need for 10% points as
well as more extreme points was originally suggested by Prof. Henry Scheffé (1953, p. 91)
in his paper discussing methods for judging all contrasts in the analysis of variance.


2. FORMULAE AND NOTATION

Let

$$z(x) = (2\pi)^{-1/2} e^{-x^2/2}, \qquad E(t) = \int_0^t z(x)\,dx, \qquad c_\nu = \frac{2\{\sqrt{(\pi\nu)}\}^{\nu}}{\Gamma(\tfrac12\nu)},$$

and let $P_n(t)$ be the cumulative distribution function of the range in samples of n from a
N(0, 1) population. Using the form given by May (1952), we have that

$$1 - F(q) = c_\nu \int_0^\infty \{(t/q)\,z(t/q)\}^{\nu}\,(1 - P_n(t))\,t^{-1}\,dt, \tag{1}$$

where

$$P_n(t) = n \int_{-\infty}^{\infty} z(u)\,(E(u+t) - E(u))^{n-1}\,du. \tag{2}$$

An alternate form for $P_n(t)$ derived by Hartley (1942) is given by

$$P_n(t) = (2E(\tfrac12 t))^n + 2n \int_{\frac12 t}^{\infty} z(u)\,(E(u) - E(u-t))^{n-1}\,du. \tag{3}$$

If n = 2 it is easy to show that

$$F(q) = \frac{2}{B(\tfrac12, \tfrac12\nu)} \int_0^{\arctan(q/\sqrt{(2\nu)})} (\cos u)^{\nu-1}\,du. \tag{4}$$
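
For n = 2 the integral in (4) is elementary for small ν, and the percentage point can be recovered numerically for any ν. The following Python sketch is illustrative only: the function names are hypothetical, Simpson's rule and bisection are chosen for convenience here and are not the procedures described in the paper.

```python
from math import atan, cos, gamma, sqrt


def beta_function(a, b):
    """Complete beta function B(a, b)."""
    return gamma(a) * gamma(b) / gamma(a + b)


def F_n2(q, nu, steps=2000):
    """F(q) for samples of n = 2, from formula (4), by Simpson's rule (steps even)."""
    theta = atan(q / sqrt(2.0 * nu))
    h = theta / steps
    total = 1.0 + cos(theta) ** (nu - 1)  # end-point values of (cos u)^(nu - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * cos(i * h) ** (nu - 1)
    return 2.0 * (total * h / 3.0) / beta_function(0.5, 0.5 * nu)


def upper_point_n2(alpha, nu):
    """Value of q with F(q) = 1 - alpha for n = 2, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if F_n2(mid, nu) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for nu in (1, 2):
        print(nu, round(upper_point_n2(0.10, nu), 3))
```

For ν = 1 formula (4) reduces to $F(q) = (2/\pi)\arctan(q/\sqrt2)$ and for ν = 2 to $F(q) = \sin\{\arctan(q/2)\}$, so the two printed values can be checked against $\sqrt2\tan(0.45\pi)$ and $2(0.9)/\sqrt{0.19}$ respectively.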


3. METHOD OF COMPUTATION 


A subroutine was first programmed using the IBM 704 computer to evaluate E(t) to
at least seven significant figures by storing a table of E(t) values in memory for
t = 0(0.2)3(0.4)7(1)9, then using the first six terms of the Taylor series expansion of E(t)
about the nearest value of t in the stored table. Two routines, herein referred to as routine 1
and routine 2, were then programmed for the IBM 704, both making use of the E(t) subroutine
already mentioned. Routine 1 was used to solve for the percentage points and routine
2 was used for checking purposes. Routine 1 computed $P_n(t)$ from (3) for t = 0(0.125)8
and n = 2(1)20, these values being stored on magnetic tape for use in (1). The values of q
which made F(q) = 1 − α were found to three significant figures by iteration using linear
interpolation in q. The integral in (3) was computed by the Euler-Maclaurin integration
formula with u = ½t(0.125)8 using terms up to and including the first derivative of the
integrand at the lower limit; the derivative at the upper limit was taken to be zero. The
integral in (1) was computed using the trapezoid rule with t = 0(0.125)8. The values of q
when n = 2 and ν = 1, 2 were found explicitly using (4). For ν = ∞, the values of q were
computed ab initio using iteration and linear interpolation to solve the equation $P_n(q) = 1 - \alpha$,
where $P_n(t)$ was computed from (2) using the rectangular rule with u = −4.5(0.125)4.5.
Starting values were found for the first column of the table (i.e. n = 2) by using a crude
interpolation between the values found at ν = 1, 2 and ν = ∞. The solution of a given case
served as the initial guess for the following case.

* See Editorial Note on p. 462 below.
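
The ν = ∞ case described above can be sketched in a few lines. The example below evaluates formula (2) by the rectangular rule on the grid u = −4.5(0.125)4.5 and solves $P_n(q) = 1 - \alpha$; it is a sketch under stated assumptions, not a reconstruction of the IBM 704 routines. In particular, E(t) is taken from the library error function rather than from a stored table with Taylor expansion, plain bisection replaces the iteration with linear interpolation, and all function names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def z(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def E(t):
    """Integral of z from 0 to t (library erf used instead of a stored table)."""
    return 0.5 * erf(t / sqrt(2.0))


def range_cdf(t, n, lo=-4.5, hi=4.5, h=0.125):
    """P_n(t) from formula (2), rectangular rule on u = lo(h)hi."""
    steps = round((hi - lo) / h)
    total = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        total += z(u) * (E(u + t) - E(u)) ** (n - 1)
    return n * h * total


def upper_point_infinite_df(alpha, n):
    """q with P_n(q) = 1 - alpha, by bisection (swapped in for linear interpolation)."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, round(upper_point_infinite_df(0.10, n), 2))
```

For n = 2 the range of two standard normal values is $\sqrt2$ times the absolute value of a standard normal variable, so the computed 10% point can be checked against $\sqrt2$ times the upper 5% point of the normal distribution.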


4. METHOD OF CHECKING 


The results of routine 1 were checked by substituting them into routine 2 which computed
$P_n(t)$ from (2) using the rectangular rule with u = −6(0.375)6. Routine 2 evaluated the
integral in (1) using the Euler-Maclaurin integration formula with terms up to and including
the first derivative of the integrand at the lower limit of integration, the value at the upper
limit being taken to be zero, the integration being stopped when the value of the integrand
became less than 10-°. The checking process resulted in altering only the first few values 
on the v = 2 row. 


The writer wishes to express his sincere appreciation to Lois Matsunaga for her patient
and skilful programming of the routines needed to produce the final results. Thanks are also
due to Prof. Henry Scheffé for indicating the need for the upper 10% points of the
‘Studentized’ range.
CORRECTED TABLES OF UPPER 5% AND 1% POINTS OF THE STUDENTIZED RANGE
Editorial note 

A short explanation of the basis of the accompanying tables of 10%, 5% and 1% points of the 
studentized range seems called for. In publishing the table of 5 % and 1 % points of this ratio, based 
on Miss Joyce May’s calculations, Prof. Hartley and I (Biometrika Tables for Statisticians, 1, 1954, 
table 29, and introduction, p. 52) recognized that a certain number of last figure errors were likely 
to be present, which we hoped would not exceed one unit. In his calculation of the 10% points 
described in the preceding paper Dr James Pachares recomputed the 5 % values for checking purposes 
and was thus able to identify about 40 unit errors in our last figure. 

After receiving Dr Pachares’s paper, we learned that some very extensive computations on the 
distribution of range and studentized range were being carried out under the direction of Dr Leon 
Harter in the Aeronautical Research Laboratory of the Wright Air Development Center, Ohio. This 
work is being published as a W.A.D.C. Technical Report (no. 58-484, vols. I and II). Since this pro-
gramme included the calculation of 10%, 5% and 1% points of the studentized range to high accuracy,
the editors of Biometrika felt that it would be useful to take this opportunity of printing, with Dr 
Pachares’s paper, corrected tables of both the upper 5% and 1% points, given throughout to two 
decimal places. 

We are most grateful to Dr Harter for allowing Biometrika to make use of his 1 % results in advance 
of publication. This has made it possible to identify some 30 last figure unit errors in Miss May’s 
original 1% table. Further, in the tables at all three levels it has been possible to include a second 
decimal for low values of v, where both Miss May and Dr Pachares had rounded off to one decimal 
place only*. Finally, Dr Harter and his collaborators have gone to the trouble of recomputing certain 
values with a smaller tolerance where a 5 in the final figure of their fuller tables left uncertainty in the 
rounding off to two decimal places. 

An interesting point which appeared in a comparison of the Pachares and Harter rules for deter- 
mining the final figure in rounding off for the 100 P% point, was that under the former rule that 
value of x is taken which minimizes |F(x)— P|, i.e. gives a probability integral nearest to the required 
value of P. On the other hand the Harter rule takes for x that value which is nearest to the value 
of X for which F(X) = P exactly. These rules almost invariably lead to the same answer, since F(x)
will increase nearly linearly for small increments of 0.01 in x. However, in one case at least the rules
do lead to different conclusions: for the 10% point, n = 12, ν = 16 the Pachares rule gives 4.81 and the
Harter rule 4.80; the former value has been inserted in Table 1. E. S. P.
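
The two rounding rules can be compared on any increasing distribution function. The following is a minimal sketch in Python, not the routine used by either computer: the function names, the bisection solver and the exponential distribution function in the usage note are all illustrative assumptions.

```python
import math

def solve_exact(F, P, lo, hi, tol=1e-12):
    """Bisection for the X with F(X) = P, assuming F is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def harter_round(F, P, lo, hi, step=0.01):
    """Tabular x nearest to the exact X for which F(X) = P."""
    X = solve_exact(F, P, lo, hi)
    return round(X / step) * step

def pachares_round(F, P, lo, hi, step=0.01):
    """Tabular x (of the two neighbours of X) that minimizes |F(x) - P|."""
    X = solve_exact(F, P, lo, hi)
    below = math.floor(X / step) * step
    above = below + step
    return min((below, above), key=lambda x: abs(F(x) - P))
```

With a strongly curved F and a coarse step the rules separate: for F(x) = 1 − exp(−x), P = F(1.24) and step 0.5, the exact X lies nearer to 1.0, but F(1.5) is nearer to P than F(1.0), so the two functions return different tabular values. With the nearly linear F and step 0.01 of the tables, such disagreement is rare.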




* Dr Harter has asked me to make clear, however, that where the tabular value lies between 10 
and 100 and two decimals are recorded he cannot guarantee that the second decimal place is not in 
error by one unit. 











Table 1. Upper 10% points of the studentized range

  ν \ n      2      3      4      5      6      7      8      9     10

    1     8.93  13.44  16.36  18.49  20.15  21.51  22.64  23.62  24.48
    2     4.13   5.73   6.77   7.54   8.14   8.63   9.05   9.41   9.72
    3     3.33   4.47   5.20   5.74   6.16   6.51   6.81   7.06   7.29
    4     3.01   3.98   4.59   5.03   5.39   5.68   5.93   6.14   6.33
    5     2.85   3.72   4.26   4.66   4.98   5.24   5.46   5.65   5.82
    6     2.75   3.56   4.07   4.44   4.73   4.97   5.17   5.34   5.50
    7     2.68   3.45   3.93   4.28   4.55   4.78   4.97   5.14   5.28
    8     2.63   3.37   3.83   4.17   4.43   4.65   4.83   4.99   5.13
    9     2.59   3.32   3.76   4.08   4.34   4.54   4.72   4.87   5.01
   10     2.56   3.27   3.70   4.02   4.26   4.47   4.64   4.78   4.91
   11     2.54   3.23   3.66   3.96   4.20   4.40   4.57   4.71   4.84
   12     2.52   3.20   3.62   3.92   4.16   4.35   4.51   4.65   4.78
   13     2.50   3.18   3.59   3.88   4.12   4.30   4.46   4.60   4.72
   14     2.49   3.16   3.56   3.85   4.08   4.27   4.42   4.56   4.68
   15     2.48   3.14   3.54   3.83   4.05   4.23   4.39   4.52   4.64
   16     2.47   3.12   3.52   3.80   4.03   4.21   4.36   4.49   4.61
   17     2.46   3.11   3.50   3.78   4.00   4.18   4.33   4.46   4.58
   18     2.45   3.10   3.49   3.77   3.98   4.16   4.31   4.44   4.55
   19     2.45   3.09   3.47   3.75   3.97   4.14   4.29   4.42   4.53
   20     2.44   3.08   3.46   3.74   3.95   4.12   4.27   4.40   4.51
   24     2.42   3.05   3.42   3.69   3.90   4.07   4.21   4.34   4.44
   30     2.40   3.02   3.39   3.65   3.85   4.02   4.16   4.28   4.38
   40     2.38   2.99   3.35   3.60   3.80   3.96   4.10   4.21   4.32
   60     2.36   2.96   3.31   3.56   3.75   3.91   4.04   4.16   4.25
  120     2.34   2.93   3.28   3.52   3.71   3.86   3.99   4.10   4.19
    ∞     2.33   2.90   3.24   3.48   3.66   3.81   3.93   4.04   4.13

  ν \ n     11     12     13     14     15     16     17     18     19     20

Table 2. Upper 5% points of the studentized range

  ν \ n      2      3      4      5      6      7      8      9     10

    1    17.97  26.98  32.82  37.08  40.41  43.12  45.40  47.36  49.07
    2     6.08   8.33   9.80  10.88  11.74  12.44  13.03  13.54  13.99
    3     4.50   5.91   6.82   7.50   8.04   8.48   8.85   9.18   9.46
    4     3.93   5.04   5.76   6.29   6.71   7.05   7.35   7.60   7.83
    5     3.64   4.60   5.22   5.67   6.03   6.33   6.58   6.80   6.99
    6     3.46   4.34   4.90   5.30   5.63   5.90   6.12   6.32   6.49
    7     3.34   4.16   4.68   5.06   5.36   5.61   5.82   6.00   6.16
    8     3.26   4.04   4.53   4.89   5.17   5.40   5.60   5.77   5.92
    9     3.20   3.95   4.41   4.76   5.02   5.24   5.43   5.59   5.74
   10     3.15   3.88   4.33   4.65   4.91   5.12   5.30   5.46   5.60
   11     3.11   3.82   4.26   4.57   4.82   5.03   5.20   5.35   5.49
   12     3.08   3.77   4.20   4.51   4.75   4.95   5.12   5.27   5.39
   13     3.06   3.73   4.15   4.45   4.69   4.88   5.05   5.19   5.32
   14     3.03   3.70   4.11   4.41   4.64   4.83   4.99   5.13   5.25
   15     3.01   3.67   4.08   4.37   4.59   4.78   4.94   5.08   5.20
   16     3.00   3.65   4.05   4.33   4.56   4.74   4.90   5.03   5.15
   17     2.98   3.63   4.02   4.30   4.52   4.70   4.86   4.99   5.11
   18     2.97   3.61   4.00   4.28   4.49   4.67   4.82   4.96   5.07
   19     2.96   3.59   3.98   4.25   4.47   4.65   4.79   4.92   5.04
   20     2.95   3.58   3.96   4.23   4.45   4.62   4.77   4.90   5.01
   24     2.92   3.53   3.90   4.17   4.37   4.54   4.68   4.81   4.92
   30     2.89   3.49   3.85   4.10   4.30   4.46   4.60   4.72   4.82
   40     2.86   3.44   3.79   4.04   4.23   4.39   4.52   4.63   4.73
   60     2.83   3.40   3.74   3.98   4.16   4.31   4.44   4.55   4.65
  120     2.80   3.36   3.68   3.92   4.10   4.24   4.36   4.47   4.56
    ∞     2.77   3.31   3.63   3.86   4.03   4.17   4.29   4.39   4.47

  ν \ n     11     12     13     14     15     16     17     18     19     20

    1    50.59  51.96  53.20  54.33  55.36  56.32  57.22  58.04  58.83  59.56
    2    14.39  14.75  15.08  15.38  15.65  15.91  16.14  16.37  16.57  16.77
    3     9.72   9.95  10.15  10.35  10.52  10.69  10.84  10.98  11.11  11.24
    4     8.03   8.21   8.37   8.52   8.66   8.79   8.91   9.03   9.13   9.23
    5     7.17   7.32   7.47   7.60   7.72   7.83   7.93   8.03   8.12   8.21
    6     6.65   6.79   6.92   7.03   7.14   7.24   7.34   7.43   7.51   7.59
    7     6.30   6.43   6.55   6.66   6.76   6.85   6.94   7.02   7.10   7.17
    8     6.05   6.18   6.29   6.39   6.48   6.57   6.65   6.73   6.80   6.87
    9     5.87   5.98   6.09   6.19   6.28   6.36   6.44   6.51   6.58   6.64
   10     5.72   5.83   5.93   6.03   6.11   6.19   6.27   6.34   6.40   6.47


Table 3. Upper 1% points of the studentized range

  ν \ n      2      3      4      5      6      7      8      9     10

    1    90.03  135.0  164.3  185.6  202.2  215.8  227.2  237.0  245.6
    2    14.04  19.02  22.29  24.72  26.63  28.20  29.53  30.68  31.69
    3     8.26  10.62  12.17  13.33  14.24  15.00  15.64  16.20  16.69
    4     6.51   8.12   9.17   9.96  10.58  11.10  11.55  11.93  12.27
    5     5.70   6.98   7.80   8.42   8.91   9.32   9.67   9.97  10.24
    6     5.24   6.33   7.03   7.56   7.97   8.32   8.61   8.87   9.10
    7     4.95   5.92   6.54   7.01   7.37   7.68   7.94   8.17   8.37
    8     4.75   5.64   6.20   6.62   6.96   7.24   7.47   7.68   7.86
    9     4.60   5.43   5.96   6.35   6.66   6.91   7.13   7.33   7.49
   10     4.48   5.27   5.77   6.14   6.43   6.67   6.87   7.05   7.21
   11     4.39   5.15   5.62   5.97   6.25   6.48   6.67   6.84   6.99
   12     4.32   5.05   5.50   5.84   6.10   6.32   6.51   6.67   6.81
   13     4.26   4.96   5.40   5.73   5.98   6.19   6.37   6.53   6.67
   14     4.21   4.89   5.32   5.63   5.88   6.08   6.26   6.41   6.54
   15     4.17   4.84   5.25   5.56   5.80   5.99   6.16   6.31   6.44
   16     4.13   4.79   5.19   5.49   5.72   5.92   6.08   6.22   6.35
   17     4.10   4.74   5.14   5.43   5.66   5.85   6.01   6.15   6.27
   18     4.07   4.70   5.09   5.38   5.60   5.79   5.94   6.08   6.20
   19     4.05   4.67   5.05   5.33   5.55   5.73   5.89   6.02   6.14
   20     4.02   4.64   5.02   5.29   5.51   5.69   5.84   5.97   6.09
   24     3.96   4.55   4.91   5.17   5.37   5.54   5.69   5.81   5.92
   30     3.89   4.45   4.80   5.05   5.24   5.40   5.54   5.65   5.76
   40     3.82   4.37   4.70   4.93   5.11   5.26   5.39   5.50   5.60
   60     3.76   4.28   4.59   4.82   4.99   5.13   5.25   5.36   5.45
  120     3.70   4.20   4.50   4.71   4.87   5.01   5.12   5.21   5.30
    ∞     3.64   4.12   4.40   4.60   4.76   4.88   4.99   5.08   5.16

  ν \ n     11     12     13     14     15     16     17     18     19

    1    253.2  260.0  266.2  271.8  277.0  281.8  286.3  290.4  294.3
    2    32.59  33.40  34.13  34.81  35.43  36.00  36.53  37.03  37.50
    3    17.13  17.53  17.89  18.22  18.52  18.81  19.07  19.32  19.55
    4    12.57  12.84  13.09  13.32  13.53  13.73  13.91  14.08  14.24
    5    10.48  10.70  10.89  11.08  11.24  11.40  11.55  11.68  11.81
    6     9.30   9.48   9.65   9.81   9.95  10.08  10.21  10.32  10.43
    7     8.55   8.71   8.86   9.00   9.12   9.24   9.35   9.46   9.55
    8     8.03   8.18   8.31   8.44   8.55   8.66   8.76   8.85   8.94
    9     7.65   7.78   7.91   8.03   8.13   8.23   8.33   8.41   8.49


ON THE DISTRIBUTION OF THE EXTREME STUDENTIZED
DEVIATE FROM THE SAMPLE MEAN


By K. C. S. PILLAI* AND BENJAMIN P. TIENZO†


The Statistical Center, University of the Philippines and the Philippine 
Statistical Survey of Households, National Economic Council 


1. INTRODUCTION AND SUMMARY 


The distribution of the extreme deviate from the mean of a sample taken from a normal 
population and that of the studentized extreme deviate from the sample mean are studied 
and percentage points are obtained for the latter distribution. This paper deals only with 
small sample sizes. 

Let $x_1 < x_2 < \dots < x_n$ be an ordered sample of size $n$ from a normal population, $\bar{x}$ the mean
of the sample and $s_\nu$ the square root of an independent mean square estimate of $\sigma^2$ based on $\nu$
degrees of freedom, where $\sigma^2$ is the square of the standard deviation of the normal population.
Then, the distributions considered are those of:

$$u_n = (x_n - \bar{x})/\sigma \quad\text{or}\quad (\bar{x} - x_1)/\sigma \tag{1}$$

and

$$t_n = (x_n - \bar{x})/s_\nu \quad\text{or}\quad (\bar{x} - x_1)/s_\nu. \tag{2}$$

McKay (1935) derived the distribution of $u_n$ and indicated its usefulness in testing outlying
observations. Nair (1948) made further studies of the distribution and tabulated the
probability integral of $u_n$ for sample sizes varying from 3 to 9. Grubbs (1950) also studied the
same distribution problem independently and obtained tables of the probability integral
of $u_n$ for sample sizes ranging from 2 to 25.

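The distribution of $u_n$ as given by McKay (1935) follows the recurrence formula

$$f_n(u) = n\left[\frac{n}{2\pi(n-1)}\right]^{1/2} \exp\left\{-\frac{nu^2}{2(n-1)}\right\} F_{n-1}\left(\frac{nu}{n-1}\right), \tag{3}$$
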
where, following Nair (1948), F,,(w), the cumulative distribution function of u, can be given 
in the form jn 
F,(u) = (V(2m)"3 Gys(nu) 


P 7 {2 
md where (2) = [exp [-|4, dt (G(x) = 0). 


Nair (1948), following Hartley’s expansion (1944) for what he termed ‘studentized 
integrals’, obtained percentage points of the distribution of t,, for sample sizes n = 3(1)9 
and degrees of freedom v = 10(1)20, 24, 30, 40, 60, 120, co. David (1956) showed that 
the studentization procedure employed by Nair tended to become unsatisfactory for 
v< 20. He provided amended figures, using the same values of v and adding results for 
n= 10, 12. 

Tabulation for $\nu < 10$ was not, however, attempted by previous authors and we shall
here make a study of the distribution of $u_n$ and also $t_n$ for $n = 3, 4, 5$, and $\nu < 10$. The

* United Nations Senior Adviser in Mathematical Statistics and Visiting Professor of Statistics,
Statistical Center, University of the Philippines. Now with United Nations, New York.
† Now Senior Statistician, Bureau of the Census and Statistics, Manila.
percentage points for $\nu < 10$ are particularly useful in many situations, for example,
in experiments with small numbers of treatments or small numbers of blocks. The
following sections present the study of the distribution problem and an application to
testing outlying observations. The percentage points derived from the formulae given in
§4 have been incorporated in the somewhat more extensive tables covering sample sizes
up to $n = 12$ given below in the paper by Pillai (1959, p. 473).


2. DEVELOPMENT OF CERTAIN SERIES EXPANSIONS 
Pillai (1950) has developed an expansion of the normal probability integral in the form

$$\int_0^w e^{-t^2/2}\,dt = w\,e^{-w^2/6}\,(1 + a_2 w^4 + \dots + a_i w^{2i} + \dots), \tag{4}$$

where $a_i$ follows the recurrence relation

$$3(2i+1)a_i - a_{i-1} = (-1)^i/[3^{i-1}\,\Gamma(i+1)]. \tag{5}$$

The series (4) has been shown to be absolutely convergent by Pillai & Ramachandran (1954).

Now substituting for $w$, $(3/\sqrt{2})\alpha$ in (4) we get an expansion of the integral $\int_0^{3\alpha/\sqrt{2}} e^{-t^2/2}\,dt$.
If we denote the resulting series by

$$\int_0^{3\alpha/\sqrt{2}} e^{-t^2/2}\,dt = \frac{3\alpha}{\sqrt{2}}\,e^{-3\alpha^2/4}\,(1 + b_2\alpha^4 + \dots + b_i\alpha^{2i} + \dots), \tag{6}$$

then

$$(2i+1)\,b_i = \tfrac{3}{2}\,b_{i-1} + (-\tfrac{3}{2})^i/i!. \tag{7}$$

If $i$ is large it has been shown by Pillai & Ramachandran (1954) that

$$-a_i/a_{i-1} < 11/(30i) \quad\text{and hence}\quad -b_i/b_{i-1} < 99/(60i),$$

which shows that series (6) is absolutely convergent. The values of the first few $b$ coefficients
are given below:

    b_2 =  0.225,               b_5 = −0.003,287,337,7,
    b_3 = −0.032,142,857,       b_6 =  0.000,837,638,92.
    b_4 =  0.018,080,357,
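
The tabulated coefficients can be regenerated from the recurrence for the $b_i$, namely $(2i+1)b_i = \tfrac{3}{2}b_{i-1} + (-\tfrac{3}{2})^i/i!$ with $b_0 = 1$. The following is a minimal sketch in Python using exact rational arithmetic; the function name is illustrative and the starting value $b_0 = 1$ is an assumption read off from the leading term of the series.

```python
from fractions import Fraction
from math import factorial

def b_coefficients(max_i):
    """Return [b_0, ..., b_max_i] from (2i+1) b_i = (3/2) b_{i-1} + (-3/2)^i / i!."""
    b = [Fraction(1)]  # b_0 = 1: the series begins 1 + b_2 alpha^4 + ...
    for i in range(1, max_i + 1):
        rhs = Fraction(3, 2) * b[i - 1] + Fraction(-3, 2) ** i / factorial(i)
        b.append(rhs / (2 * i + 1))
    return b

b = b_coefficients(6)
for i in range(2, 7):
    print(f"b_{i} = {float(b[i]):.11f}")
```

The recurrence gives $b_1 = 0$ exactly, which is why the series contains no term in $\alpha^2$.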


3. THE DISTRIBUTION OF THE EXTREME DEVIATE FROM THE SAMPLE MEAN 


In this section the distribution of the extreme deviate from the mean of a sample from a 
normal population is developed in series form for n = 3, 4 and 5. 
(i) $n = 3$. From (3), $f_3(u)$ can be obtained by putting $n = 3$ and is given by

$$f_3(u) = \frac{3\sqrt{3}}{\pi}\,e^{-3u^2/4}\int_0^{3u/2} e^{-z^2}\,dz. \tag{8}$$

Transforming $z = t/\sqrt{2}$ and using the series (6) in (8) we get

$$f_3(u) = \frac{9\sqrt{3}}{2\pi}\,e^{-3u^2/2}\,(u + b_2u^5 + \dots + b_iu^{2i+1} + \dots). \tag{9}$$

In $f_3(u)$, if we retain terms up to and including $b_5u^{11}$ and approximate the remaining
terms in an exponent, we get

$$f_3(u) = \frac{9\sqrt{3}}{2\pi}\,e^{-3u^2/2}\,(u + b_2u^5 + b_3u^7 + b_4u^9 + b_5u^{11}e^{b_6u^2/b_5}). \tag{10}$$
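
The quality of the exponent approximation in (10) can be judged by comparing it with the integral form (8), which is expressible through the error function since $\int_0^x e^{-z^2}dz = \tfrac{1}{2}\sqrt{\pi}\,\mathrm{erf}(x)$. The following is a minimal sketch in Python; the function names and the comparison grid are illustrative assumptions, and the coefficients are those tabulated in §2.

```python
import math

# b coefficients as tabulated in section 2
B = {2: 0.225, 3: -0.032142857, 4: 0.018080357,
     5: -0.0032873377, 6: 0.00083763892}

def f3_exact(u):
    """Density (8), with the integral written through erf."""
    integral = 0.5 * math.sqrt(math.pi) * math.erf(1.5 * u)
    return 3 * math.sqrt(3) / math.pi * math.exp(-0.75 * u * u) * integral

def f3_approx(u):
    """Density (10): terms to b_5 u^11, the remainder absorbed in an exponent."""
    series = (u + B[2] * u**5 + B[3] * u**7 + B[4] * u**9
              + B[5] * u**11 * math.exp(B[6] * u * u / B[5]))
    return 9 * math.sqrt(3) / (2 * math.pi) * math.exp(-1.5 * u * u) * series

grid = [0.05 * j for j in range(31)]  # u = 0, 0.05, ..., 1.5
max_err = max(abs(f3_exact(u) - f3_approx(u)) for u in grid)
print(f"largest absolute difference on the grid: {max_err:.2e}")
```

The difference grows with u, because the exponent replaces an alternating tail whose terms become larger as u increases.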


The c.d.f. of $u$ can be obtained from (10) as a series of incomplete gamma functions, and is
given by


RU) = [3 Ib(G)!" yoald + 1) PG-+ 1) + Bor *Loxl6) 716), (11) 
where r=2-5b,/b; and L[,(p)= I. e-” w? dw/T(p+1). 
0 


If we make $U$ tend to infinity in (11) we get the total probability. The total probability
attained for $F_3(\infty)$ in (11) is 0.999,262. For the 95 and 99% probability, Grubbs (1950)
gives $U$ = 1.738 and 2.215, respectively, which when substituted in (11) give probabilities
of 0.950,036 and 0.989,882; hence the expression (11) was considered to give sufficient
accuracy for obtaining the upper percentage points.

(ii) $n = 4$. To obtain $f_4(u)$ we substitute $n = 4$ in (3) and use (11). We get

$$f_4(u) = 4\sqrt{\frac{2}{3\pi}}\,e^{-2u^2/3}\,F_3(4u/3). \tag{12}$$

Integrating $f_3(u)$ in (10) by parts in the interval 0 to $4u/3$ and substituting in (12) we arrive
at the form

$$f_4(u) = A\{e^{-2u^2/3}[b + e^{-8u^2/3}(c_0 + c_1u^2 + \dots + c_4u^8)] + e^{du^2}(e_0 + e_1u^2 + \dots + e_5u^{10})\}, \tag{13}$$

where A = 4.571,541,8, b = 0.402,768,92, d = −3.786,324,8 and

    i        c_i                  e_i

    0    −0.409,523,81       0.006,754,891,7
    1    −0.203,174,60       0.021,072,953
    2    −0.270,899,47       0.032,807,204
    3    −0.030,099,941      0.034,181,266
    4    −0.060,199,882      0.026,658,466
    5                        0.016,633,060


Integrating (13) in the interval 0 to U we get F,(U) which could be shown to reduce to the 
following form involving incomplete gamma functions: 


5 6 
FU) = BIgys(3) (3) +E & fhovralt —3)TP-3)+D x GiTay(i—4)T(@—4), (14) 


where B = 1-127,546,0, C = 1-523,847,3, D = —0-002,141,004,0 and 


i Si 9 

1 — 0-336,458,14 — 3-706,164,5 

2 — 0-050,077,491 — 3-053,611,7 

3 — 0-020,030,996 — 1-257,977,7 

4 — 0-000,667,699,88 — 0°345,494,26 
5 — 0-000,400,619,93 — 0-071,165,578 
6 — 0-011,727,059 
The total probability obtained from $F_4(\infty)$ is 1.000,061, and for the 95 and 99% points
given by Grubbs (1950), we get the values 0.950,062 and 0.990,063.
(iii) $n = 5$. From (3), putting $n = 5$, we get

$$f_5(u) = 5\sqrt{\frac{5}{8\pi}}\,e^{-5u^2/8}\,F_4(5u/4). \tag{15}$$

Integrating $f_4(u)$ in (13) in the interval 0 to $5u/4$, we arrive at $f_5(u)$ in the following form:
fs(u) = Ble [how +h, u + hau? +hzu"] 


2r- . . . ° 2 fu 2 
+e [ju +i,ue+i,u®+igu? +i,u9] + Feu | e- 8" dw 
0 
fu i 
+ Get | es" dw+H ew | 
0 


"ete du] : (16) 
0 


where E = 10.195,247, F = 0.402,768,92, G = −0.463,009,52, H = 0.013,920,354,
a = −6.541,132,5 and

    i        h_i                  i_i

    0     0.066,857,143      −0.008,956,828
    1     0.099,867,725      −0.021,607,151
    2     0.042,713,845      −0.031,069,928
    3     0.043,058,311      −0.029,234,091
    4                        −0.016,364,940


4. THE DISTRIBUTION OF THE EXTREME STUDENTIZED DEVIATE FROM THE SAMPLE MEAN 


The distribution of $t_n = (x_n - \bar{x})/s_\nu$ or $(\bar{x} - x_1)/s_\nu$ could be obtained by using $f_n(u)$ derived
in the previous section and $f(s_\nu)$ given by

$$f(s_\nu) = \frac{2(\tfrac{1}{2}\nu)^{\nu/2}}{\Gamma(\tfrac{1}{2}\nu)}\,s_\nu^{\nu-1}\,e^{-\nu s_\nu^2/2}. \tag{17}$$

Since $s_\nu$ is independent of $u$, the simultaneous distribution of $u$ and $s_\nu$ is given by

$$f(u, s_\nu) = f_n(u)\,f(s_\nu). \tag{18}$$

In (18), if we use the transformation

$$t_n = u/s_\nu \quad\text{and}\quad s_\nu = s_\nu, \tag{19}$$

and integrate $s_\nu$ in the interval 0 to $\infty$ we get $f(t_n)$. Let us study the distribution of $t_3$, $t_4$
and $t_5$ separately.

(i) $n = 3$. Multiplying (10) by (17), applying the transformation (19) with $n = 3$ and
integrating $s_\nu$ from 0 to $\infty$ we get $f(t_3)$. $f(t_3)$ can be shown to be of the form


fits) = (18,13 oy} /2n PMH (gee y"ree) 


3+v 2 
4b, t3i+1 2 \btiH lp . 
ih a eee i 1 
Terms involved in $f(t_3)$ can be reduced to $\beta$-functions since $t_3$ varies from 0 to $\infty$. Hence
the c.d.f. of $t_3$ can be given in the form


4 b,T(gv+itl) 2 


24), T(Av +6 
a) = RLS Sear “ar 


LT (¥y) 





oOo. B 
61-0) dO + I 651-0)” ao}, 
0 0 


(21) 
where K = 4.961,960,1, L = 1,868.775,6,

$$\alpha = 3t_3^2/(\nu + 3t_3^2), \qquad \beta = 3.509{,}615{,}4\,t_3^2/(\nu + 3.509{,}615{,}4\,t_3^2).$$

In arriving at (21) the transformation used for all terms except the last one in (20) is $\theta = \alpha$,
and in the last term of (20) is $\theta = \beta$.

Alternatively, for odd values of $\nu$, it is possible to obtain the distribution of $t_3$ separately
for each $\nu$ in a convenient form involving $\beta$-functions. The distribution obtained in this
manner is exact. For example, when $\nu = 1$, multiplying $f_3(u)$ and $f(s_\nu)$, using the
transformation (19) with $\nu = 1$ and integrating with respect to $s_\nu$ by parts, we get

$$f_{\nu=1}(t_3) = [(9\sqrt{3})/\pi]\,[t_3/\{(3t_3^2 + 2)(6t_3^2 + 1)^{1/2}\}] \tag{22}$$

and

$$F_{\nu=1}(t_3) = [(3\sqrt{3})/(8\pi)] \int_0^{6t_3^2/(1+6t_3^2)} (1-\theta)^{-1/2}\,(1 - 3\theta/4)^{-1}\,d\theta. \tag{23}$$

The other cases also could be attempted in a similar manner.
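
For $\nu = 1$ and $n = 3$ the integral in (23) can be carried through in elementary terms: the substitution $x = (1-\theta)^{1/2}$ gives $F_{\nu=1}(t_3) = 1 - (3/\pi)\arctan\{3/(1+6t_3^2)\}^{1/2}$. This closed form is a derivation added here for illustration, not a formula printed in the paper. The following minimal sketch in Python checks it against a numerical integration of the density (22) and inverts it for an upper percentage point; the function names and the Simpson rule are illustrative assumptions.

```python
import math

def f_t3_nu1(t):
    """Density (22) of t_3 for nu = 1."""
    return 9 * math.sqrt(3) / math.pi * t / ((3 * t * t + 2) * math.sqrt(6 * t * t + 1))

def F_t3_nu1(t):
    """(23) in closed form (assumed derivation): 1 - (3/pi) arctan sqrt(3/(1+6t^2))."""
    return 1 - 3 / math.pi * math.atan(math.sqrt(3 / (1 + 6 * t * t)))

def simpson(g, a, b, m=2000):
    """Composite Simpson rule with m (even) intervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def upper_point(alpha):
    """Solve F(t) = 1 - alpha by inverting the closed form."""
    y = math.tan(math.pi * alpha / 3)
    return math.sqrt((3 / (y * y) - 1) / 6)

print(round(upper_point(0.05), 2))
```

The 5% point obtained in this way can be set beside the entry for $\nu = 1$, $n = 3$ in the table on p. 473.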
(ii) n = 4. Similarly, using f,(w) given in (13) and f(s,) in (17), we get 


f(t) = 44 [oaytray+ & C,(;8 82D DP (7 te a 3 = e,(2/7-572,649 eorn (4) _— 


(24) 
e+” 94, aa 
and F(t,) = Mb— ra “6410-140 
-aytesry PUR + 26+ D} (™ pyeen (y _ pybo- 

+N ¥ (0 3)8@ +1) a. a f phe D1 6) 1d0 

. poss PE (V + 204+ 1} [?" pre. « 

( —q)-teitp LN 4(2t-1) (] —_ 9\}-1 
+N & e(—d)tonn f. gk2i—-D (1 9)b-=1d8, (25) 


where M = 2.799,486,2, N = 2.285,770,9, $m' = 4t_4^2/(3\nu + 4t_4^2)$,
$n' = 20t_4^2/(3\nu + 20t_4^2)$, $p' = 7.572{,}649{,}6\,t_4^2/(\nu + 7.572{,}649{,}6\,t_4^2)$.


(iii) $n = 5$. The method used for this case is similar to that adopted for $f_{\nu=1}(t_3)$. $f(t_5)$ for
$\nu$ = 1, 3, 5, 7, 9 and 11 have been evaluated separately as well as the corresponding c.d.f.'s.
The expressions for the c.d.f. are given below for the different cases.


i(»—1) ky 
Fits) = “ =a (1—@)t-2 (1 — 56/8)“ do 


= ) 


‘. (1—@)8e-2 (1 — 250/28)-@+» do 


i) . 
+z Uy "(1 —@)ke-® (1 — 0-904,450,800)-+ d0 
i=0 


3 4 My 
+> ar (1-0) 9'd0+ > Wm | (1—0)8-29:d0 (vy =1,3,5,7,9,11). (26) 
i=0 i=0 0 


The values of ,S;, ,7;,, ,U;, k,, l,, m,, ,V; and ,W,, are given in Tienzo (1958). 



5. PERCENTAGE POINTS FOR THE EXTREME STUDENTIZED DEVIATE FROM THE SAMPLE MEAN


The percentage points for $n$ = 3, 4 and 5 were computed from the formulae for $F(t_n)$ given in
the previous section. For $n = 5$ and even degrees of freedom, the values were obtained by
interpolation. For $n = 2$ the values were computed by dividing by 2 corresponding
percentage points of the studentized range (Pillai, 1957). All these values have been
incorporated in the more extensive table of 5 and 1% points for sample sizes up to $n = 12$,
printed on p. 473.
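
A rough independent check on any tabulated point is to simulate $t_n$ directly from its definition. The following is a minimal Monte Carlo sketch in Python, not the method of the paper: the function name, the number of replications and the seed are illustrative assumptions, and the result carries sampling error in the second decimal.

```python
import random

def simulate_upper_point(n, nu, alpha, reps=100_000, seed=1959):
    """Empirical upper 100*alpha % point of t_n = (x_n - xbar)/s_nu."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(x) / n
        # independent mean-square estimate: nu * s^2 / sigma^2 is chi-square(nu),
        # generated here as a gamma variate with shape nu/2 and scale 2
        s = (rng.gammavariate(nu / 2, 2.0) / nu) ** 0.5
        values.append((max(x) - xbar) / s)
    values.sort()
    return values[int((1 - alpha) * reps)]

print(round(simulate_upper_point(3, 5, 0.05), 2))
```

For $n = 3$ and $\nu = 5$ the simulated 5% point should fall close to the corresponding entry of the table on p. 473.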


6. APPLICATION OF THE TEST 


Cochran & Cox (§ 4.23, 1950) have given an example of a randomized block experiment
carried out by the North Carolina Agricultural Experiment Station in 1944. The experiment
tested the effect of five levels of application of potash, supplying respectively 36, 54,
72, 108 and 144 lb. K₂O per acre, on the yield and properties of cotton. The experiment was
arranged in three randomized blocks of five plots each. The means for the five levels were
respectively 7.85, 8.05, 7.74, 7.51 and 7.45 with general mean 7.72.


                        Analysis of variance

    Source of variation    D.F.     S.S.        M.S.        F

    Replications             2     0.097,1       -          -
    Treatment                4     0.732,4     0.183,1     4.19
                                   (Significant at 95% level but not
                                   significant at the 99% level)
    Error                    8     0.349,5     0.043,7      -

    Total                   14     1.179,0       -          -


For testing the difference of the largest mean from the general mean, we compute $t_5 = 2.73$.
Referring to Table 1 on p. 473, we see that this value is significant at the 95% but not
significant at the 99% level.

Considering the smallest mean, the corresponding value of $t_5$ is 2.24. Referring to the
same Table, it is found not to be significant at the 95% level.
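
The arithmetic of this example can be retraced as follows. This is a minimal sketch in Python which assumes, as the quoted values imply, that each treatment mean is based on three plots, so that its standard error is the square root of (error mean square)/3; the variable names are illustrative, and the 5% point 2.57 is the entry for $n = 5$, $\nu = 8$ of Table 1 on p. 473.

```python
import math

means = [7.85, 8.05, 7.74, 7.51, 7.45]   # treatment means
blocks = 3                                # plots per treatment mean
treatment_ms = 0.1831                     # 4 d.f.
error_ms = 0.0437                         # 8 d.f.

general_mean = sum(means) / len(means)
s = math.sqrt(error_ms / blocks)          # assumed standard error of a mean

t_largest = (max(means) - general_mean) / s
t_smallest = (general_mean - min(means)) / s
F_ratio = treatment_ms / error_ms

crit_5pct = 2.57                          # Table 1, n = 5, nu = 8, 5% point
print(round(t_largest, 2), t_largest > crit_5pct)
print(round(t_smallest, 2), t_smallest > crit_5pct)
```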


The authors wish to acknowledge the facilities offered by the Statistical Center, Univer- 
sity of the Philippines, in the preparation of this paper. 


Upper percentage points of the extreme studentized deviate from the sample mean


By K. C. S. PILLAI*
The Statistical Center, University of the Philippines


Let $x_1 < x_2 < \dots < x_n$ be an ordered sample of size $n$ from a normal population, $\bar{x}$ the mean of the
sample and $s_\nu$ the square root of an independent mean square estimate of $\sigma^2$ based on $\nu$ degrees of freedom,
where $\sigma^2$ is the square of the standard deviation of the normal population. Then the extreme studentized
deviate from the sample mean is defined as $t_n = (x_n - \bar{x})/s_\nu$ or $(\bar{x} - x_1)/s_\nu$. Table 1 gives the upper 5 and
1% points of $t_n$ for sample sizes ranging from 2 to 12 and degrees of freedom from 1 to 10, thus filling the
gap that existed in the tables of upper percentage points of $t_n$.†


Table 1. Upper percentage points of the extreme studentized deviate from the sample mean
$(x_n - \bar{x})/s_\nu$ or $(\bar{x} - x_1)/s_\nu$

  ν \ n      2      3      4      5      6      7      8      9     10     12

                                    5% points

    1      9.0   13.5   16.4   19     20     22     23     24     25     26
    2     3.04   4.23   4.98    5.5    6.0    6.3    6.6    6.9    7.1    7.5
    3     2.25   3.03   3.50   3.88   4.15   4.36   4.55   4.72   4.86   5.11
    4     1.96   2.58   2.98   3.26   3.48   3.65   3.80   3.93   4.05   4.24
    5     1.82   2.37   2.71   2.95   3.15   3.30   3.43   3.54   3.64   3.80

    6     1.73   2.24   2.55   2.78   2.95   3.09   3.21   3.31   3.39   3.54
    7     1.67   2.15   2.45   2.66   2.82   2.95   3.06   3.15   3.23   3.37
    8     1.63   2.09   2.37   2.57   2.72   2.85   2.95   3.04   3.12   3.25
    9     1.60   2.04   2.32   2.51   2.65   2.78   2.87   2.96   3.03   3.15
   10     1.58   2.01   2.27   2.46   2.60   2.72   2.81   2.89   2.96   3.08

                                    1% points

    2      7.0    9.9   11.3   12.6   13.6   14.4   15.0   15.6   16.1   16.9
    3     4.13    5.5    6.3    6.9    7.3    7.7    8.1    8.4    8.6    9.0
    4     3.26   4.23   4.81   5.23   5.54   5.80   6.03   6.22   6.39   6.68
    5     2.85   3.65   4.11   4.45   4.70   4.93   5.11   5.26   5.39   5.62

* United Nations Senior Adviser in Mathematical Statistics and Visiting Professor of Statistics, now
with the United Nations, New York.

† The columns for $n = 2$ and the rows for $\nu = 10$ are not new but are included for the convenience of
the table user.

Table 1 was prepared in three steps. First, a table of trial values was obtained by taking one-half of
the corresponding percentage points of the studentized range, $q = (x_n - x_1)/s_\nu$, from Pillai's Tables (1957),
and adjusting these values after noting the differences between them and the percentage points of $t_n$
given by David (1956) and Pillai & Tienzo (1959). Secondly, based on the table of trial values and
employing David's method (1956) of solving the equation

$$\int_0^\infty f(u_n) \int_0^{u_n/R_\alpha} f(s_\nu)\,ds_\nu\,du_n = \alpha, \tag{1}$$

(where $R_\alpha$ is the upper $100\alpha$ percentage point of the extreme studentized deviate and $u_n$ the extreme
deviate from the sample mean) percentage points of $t_n$ were obtained for $\nu$ = 1, 2, 4, 6 and 8 and values
of $n$ starting from 6. Since the integral for $s_\nu$ is exactly evaluable for even degrees of freedom, equation
(1) was solved for $R_\alpha$ by numerical integration using Grubbs' tables (1950) of the cumulative distribution
function of $u_n$. For $\nu = 1$ both Grubbs' tables and normal probability tables were used. Lastly,
percentage points for $\nu$ = 3, 5, 7 and 9 and $n$ ranging from 6 to 12 were obtained through adjustment of
the trial values by the average of the differences observed for the two neighbouring values from the
corresponding trial values for even degrees of freedom and the same value of $n$.
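
The second step can be illustrated for $\nu = 2$, where the inner integral of (1) is $1 - \exp(-u_n^2/R_\alpha^2)$. The following is a minimal sketch in Python, not the computation actually performed: it uses the known closed-form densities of $u_2$ and $u_3$ in place of Grubbs' tables, a Simpson rule for the outer integral and bisection for $R_\alpha$, and all function names, integration limits and tolerances are illustrative assumptions.

```python
import math

def f_u2(u):
    """Density of u_2 (half-normal with variance 1/2)."""
    return 2 / math.sqrt(math.pi) * math.exp(-u * u)

def f_u3(u):
    """Density of u_3 written through the error function."""
    return (3 * math.sqrt(3) / (2 * math.sqrt(math.pi))
            * math.exp(-0.75 * u * u) * math.erf(1.5 * u))

def tail_prob(f_u, R, upper=12.0, m=2000):
    """Left side of (1) for nu = 2, where P(s < c) = 1 - exp(-c^2)."""
    g = lambda u: f_u(u) * (1 - math.exp(-(u / R) ** 2))
    h = upper / m
    total = g(0.0) + g(upper)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * g(j * h)
    return total * h / 3

def solve_R(f_u, alpha, lo=0.1, hi=200.0, tol=1e-6):
    """Bisection on R: the tail probability decreases as R increases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_prob(f_u, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(solve_R(f_u2, 0.05), 2), round(solve_R(f_u3, 0.05), 2))
```

For $n = 2$ the equation has the explicit solution $R_\alpha = \{(1-\alpha)^{-2} - 1\}^{-1/2}$, which provides a check on the numerical routine.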


I am grateful to Professor E. S. Pearson for his suggestion of filling the gap between the table of David
and that of Pillai & Tienzo.

My thanks are due to Miss Aurora Abesamis for the computational assistance in preparing Table 1.
I also wish to acknowledge the facilities offered by the Statistical Center in the preparation of this paper.


The asymptotic efficiency of the $\chi_r^2$-test for a balanced incomplete block design*


By PH. VAN ELTEREN AND G. E. NOETHER†
Mathematical Centre, Amsterdam, and Boston University


Friedman (1937) has shown how $n$ treatments can be compared on the basis of the rankings of $m$
'observers'. This method has been extended by Durbin (1951) to cover a balanced incomplete block design,
i.e. the case when each observer ranks $k < n$ treatments exactly once and each treatment is compared
with any other treatment exactly $\lambda$ times. We want to find the asymptotic ($m \to \infty$) relative efficiency
(in the sense of Pitman, see e.g. Hannan, 1956) of Durbin's test with respect to the usual analysis of
variance test for a balanced incomplete block design.
We note for future reference that

$$\lambda = \frac{l(k-1)}{n-1}, \tag{1}$$

where $l$ is the total number of times a given treatment is used (replications).

The asymptotic relative efficiency is most easily obtained with the help of a formula given explicitly
for the first time by Hannan (1956). It may be stated roughly in the following way. If both test statistics
have, under the alternative hypothesis, non-central $\chi$-square distributions with the same number of
degrees of freedom, the asymptotic relative efficiency of one test with respect to the other test is equal
to the ratio of the two non-centrality factors after the alternatives have been set equal.

Essentially, then, all we have to do is to compute the two non-centrality factors which we shall denote
by $d_r^2$ and $d_F^2$ for the rank and F-tests, respectively. The conditions for the applicability of Hannan's
formula can be shown to hold provided the underlying distributions satisfy some very general regularity
conditions. These details will not be presented in this paper, since similar considerations have been given
in several other papers, e.g. Andrews (1954), Benard & van Elteren (1953) and Bradley (1955).

Before computing the non-centrality factors, we have to specify the mathematical model which we 
are going to consider. The usual analysis of variance model suggests the following approach. 

Let $F(x)$ be a continuous cumulative distribution function with density function $f(x) = F'(x)$. Let
$\mathbf{x}_{\mu\nu}$ ($\mu = 1, 2, \dots, m$; $\nu = 1, 2, \dots, n$)‡ be the chance variable associated with the observation of the $\mu$th
observer (block, in the analysis of variance) on the $\nu$th treatment. It is then assumed that the
distribution function $F_{\mu\nu}(x)$ of $\mathbf{x}_{\mu\nu}$ is given by

$$F_{\mu\nu}(x) = F(x + \theta_\nu + \tau_\mu),$$

where, without loss of generality, we may assume that $\sum_\nu \theta_\nu = 0$. The null hypothesis to be tested is that
$\theta_1 = \theta_2 = \dots = \theta_n = 0$. The alternatives to be considered specify that for a given number of
replications $l$,

$$\theta_\nu = \theta_{\nu l} = \delta_\nu/\sqrt{l},$$

where the $\delta_\nu$ are given constants satisfying

$$\sum_\nu \delta_\nu = 0. \tag{2}$$

For this model, we find (e.g. Anderson & Bancroft, 1952, §19.3) that

$$d_F^2 = lC\sum_\nu \theta_\nu^2/\sigma^2 = C\sum_\nu \delta_\nu^2/\sigma^2, \tag{3}$$

where $\sigma^2$ is the variance associated with $F(x)$ and

$$C = \frac{n(k-1)}{k(n-1)} \tag{4}$$

is the efficiency factor of the given incomplete block design. The fact that for large $m$, F has approximately
a non-central $\chi$-square distribution follows from an argument similar to that given in Andrews (1954).

For computing $d_r^2$, let us introduce the following notation. Let $\mathbf{r}_{\mu\nu}$ stand for the rank which the $\mu$th
observer assigns to the $\nu$th treatment (assuming that he considers the $\nu$th treatment), and let $\mathbf{R}_\nu = \sum_\mu \mathbf{r}_{\mu\nu}$,


* Report SP 63 of the Mathematical Centre, Amsterdam.
† Work supported by the Office of Naval Research.
‡ Here and below we use bold type to indicate chance variables.
where $\sum_\mu$ denotes summation over those $l$ observers who have rated treatment $\nu$. If we then set
$u_\nu = R_\nu - \tfrac{1}{2}l(k+1)$, Durbin's test statistic is, except for a constant factor given by

$$\chi_r^2 = \frac{12}{n\lambda(k+1)}\sum_\nu u_\nu^2,$$

which, for large $l$, has approximately a non-central $\chi$-square distribution with $(n-1)$ degrees of freedom.
We then have

$$d_r^2 = \frac{12}{n\lambda(k+1)}\sum_\nu [\mathscr{E}(u_\nu)]^2.$$

Now

$$\mathscr{E}(R_i) = \sum_\mu \sum_{j(\mu) \ne i} P\{x_{\mu i} > x_{\mu j}\} + l, \tag{5}$$

where $j(\mu)$ refers to the treatments rated by the $\mu$th observer. Since

$$P\{x_{\mu i} > x_{\mu j}\} = \int F(x + \theta_j + \tau_\mu)\,f(x + \theta_i + \tau_\mu)\,dx = \int F(x + \theta_j - \theta_i)\,f(x)\,dx \sim \frac{1}{2} + \frac{\delta_j - \delta_i}{\sqrt{l}}\int f^2(x)\,dx,$$

and since every treatment occurs exactly $\lambda$ times together with every other treatment, (5), in view of
(1) and (2), becomes

$$\mathscr{E}(R_i) \sim \tfrac{1}{2}l(k+1) - \frac{n\lambda}{\sqrt{l}}\,\delta_i\int f^2(x)\,dx.$$

Thus

$$\mathscr{E}(u_i) \sim -\frac{n\lambda}{\sqrt{l}}\,\delta_i\int f^2(x)\,dx,$$

and, finally,

$$d_r^2 \sim \frac{12n(k-1)}{(k+1)(n-1)}\left[\int f^2(x)\,dx\right]^2\sum_\nu \delta_\nu^2. \tag{6}$$
From (3), (4) and (6), the asymptotic relative efficiency $E_r$ of the $\chi_r^2$-test with respect to the F-test is
found to be

$$E_r = \frac{12k}{k+1}\,\sigma^2\left[\int f^2(x)\,dx\right]^2. \tag{7}$$

It is interesting to note that (7) depends only on the block size $k$, and not on the number of treatments.
We obtain the relative efficiency of the Friedman test by replacing $k$ by $n$ in (7).
In the special case when $f(x)$ is normal, (7) becomes

$$E_r(N) = \frac{3k}{\pi(k+1)}, \tag{7N}$$

some values of which are tabulated below:

    k         2     3     4     5     6     7     8     9    10     17     ∞
    E_r(N)  0.64  0.72  0.76  0.80  0.82  0.84  0.85  0.86  0.87  >0.90  0.95

When $k = 2$, we have the case of paired comparisons where each observer is asked to compare only
two treatments. In this case, (7) becomes

$$E_r = 8\sigma^2\left[\int f^2(x)\,dx\right]^2, \tag{8}$$

and (7N),

$$E_r(N) = 2/\pi = 0.637. \tag{8N}$$

The problem of paired comparisons has been of considerable interest during the last few years and
many methods for treating the problem have been offered. It is easy to show that the $\chi_r^2$-test for $k = 2$
is equal to the asymptotic form of the Bradley-Terry test discussed by Bradley (1955).

Bradley finds $n/\{\pi(n-1)\}$ as the asymptotic efficiency of his T-test relative to the analysis of variance
test in case of normality where, following our notation, $n$ is the number of treatments involved. The
difference between Bradley's result and our own result (8N) is due to the fact that Bradley assumes a
one way classification for the analysis of variance with $l$ observations in each class instead of the balanced
incomplete block design which we have considered. Since the variance $\sigma^2$ in (8) clearly refers to the
within block variability, the latter model seems to be more appropriate.

If we multiply Bradley's result by $1/C$, where $C$ is given by (4) with $k = 2$, $lC$ being the effective number
of replications of the balanced incomplete block design, we get (8N).

As far as the $\chi_r^2$-test is concerned, instead of the model

$$F_{\mu\nu}(x) = F(x + \theta_\nu + \tau_\mu)$$

we could have considered the more general model

$$F_{\mu\nu}(x) = F_\mu(x + \theta_\nu),$$

where, for different $\mu$, the $F_\mu(x)$ may be different cumulative distribution functions. Nothing much is
gained as long as we are interested in an incomplete block design, since the non-centrality parameter
$d_r^2$, and therefore the power of the $\chi_r^2$-test, will depend on which particular $k$ treatments are put in which
particular block. However, in the case of Friedman's test when $k = n$, easy computations show that
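
$$d_r^2 \sim \frac{12n}{n+1}\left[\frac{1}{m}\sum_\mu \int f_\mu^2(x)\,dx\right]^2 \sum_\nu \delta_\nu^2, \tag{9}$$

where $f_\mu(x) = F_\mu'(x)$. (9) can be used to compute the asymptotic power of Friedman's test for this more
general alternative.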

For k = n = 2, the χ²_r-test is equivalent to the two-sided sign test, a test whose properties have been 
more thoroughly investigated than those of any other distribution-free test. (8) then becomes the 
asymptotic efficiency of the sign test relative to the t-test, a quantity which is usually given (e.g. Hodges 
& Lehmann, 1956) as 

E = 4σ²f²(0). (10) 


The explanation is that (10) refers to the one-sample sign test while (8) refers to the two-sample sign 
test, a distinction which is rarely made. (10) can be used for the two-sample case if f is interpreted as 
the density of x − y, where x and y are the two chance variables under consideration. From a practical 
point of view, it is more important to know the efficiency in terms of the individual distribution of x 
and y. Then the answer is given by (8). As far as the two-sample sign test is concerned, it is, of course, 
easy to obtain (8) directly without reference to the χ²_r-test. 
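
As a numerical illustration of (10), the following sketch evaluates 4σ²f²(0) for a normal density, for which the familiar value 2/π results. The function names and the choice of a normal density are illustrative assumptions and are not taken from the text.

```python
import math

def sign_test_efficiency(sigma, density_at_median):
    """Formula (10): 4 * sigma^2 * f(0)^2, with f(0) the density height at the median."""
    return 4.0 * sigma ** 2 * density_at_median ** 2

def normal_density_at_median(sigma):
    """Height of a normal density with standard deviation sigma at its centre."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

sigma = 1.7  # arbitrary scale (assumed); the efficiency does not depend on it
efficiency = sign_test_efficiency(sigma, normal_density_at_median(sigma))
print(round(efficiency, 4))  # 0.6366, i.e. 2/pi
```

The scale cancels because f(0) is proportional to 1/σ, which is why (10) is a pure number for any location-scale family.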

The results obtained in this paper are asymptotic, i.e. valid for large numbers of replications. Nothing 
seems to be known about the relative efficiency of the χ²_r-test for small numbers of replications except 
in the case k = n = 2. Since in this particular case the relative efficiency for a small number of replications 
is known to be much higher (in the case of normality) than indicated by the asymptotic formula, it seems 
reasonable to assume that the asymptotic values derived in this paper can be considered minimum values 
for the corresponding efficiencies. 
A note on the application of Quenouille’s method of bias reduction 
to the estimation of ratios 


By J. DURBIN 


Research Techniques Division, London School of Economics 


1. An ingenious device for reducing estimation bias from O(n⁻¹) to O(n⁻²) has been proposed by 
Quenouille (1949) and has been considered further in a recent paper by the same author (Quenouille, 
1956). The simplest form of the device is as follows. Suppose we have an estimator t_n based on n 
observations whose bias is cn⁻¹ + O(n⁻²), where c is a constant. Let t₁ and t₂ denote the estimators calculated 
from the two samples of ½n observations obtained by dividing the original sample into two equal parts. 
Then the estimator t = 2t_n − ½(t₁ + t₂) has bias of order n⁻² at most. 
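
The device just described can be sketched in a few lines. The helper names `quenouille_split` and `divisor_n_variance` are hypothetical; the divisor-n variance is used only because its bias is exactly of the form cn⁻¹, so it is a convenient assumed test case rather than the ratio estimator studied in this note.

```python
def quenouille_split(estimator, sample):
    """Return 2*t_n - (t_1 + t_2)/2, with t_1, t_2 computed on the two half-samples."""
    n = len(sample)
    if n % 2:
        raise ValueError("the sample must split into two equal halves")
    half = n // 2
    t_full = estimator(sample)
    t_first = estimator(sample[:half])
    t_second = estimator(sample[half:])
    return 2 * t_full - (t_first + t_second) / 2

def divisor_n_variance(values):
    """Variance estimator with divisor n; its bias is -sigma^2/n."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

data = [1.0, 2.0, 3.0, 4.0]
corrected = quenouille_split(divisor_n_variance, data)
print(corrected)  # 2.25
```

Here t_n = 1.25 and each half gives 0.25, so the combination is 2(1.25) − 0.25 = 2.25.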
It might, perhaps, have been expected that this reduction in bias could only be achieved at the 
expense of a corresponding increase of variance. However, in his 1956 paper Quenouille showed that any 
such increase in variance is of small order in n compared with the variance itself. The purpose of the 
present note is to demonstrate the rather surprising result that for an important class of estimators 
Quenouille’s device actually reduces the variance. The estimators concerned are ratio estimators of the 
form r = y/x where the regression of y on x is linear and where x is normally distributed. We use asymptotic 
expansions taking terms in n⁻¹ as far as n⁻³. Later on we consider other ratio estimators for which 
x has a Γ-distribution and show that although for these estimators the variance is slightly increased the 
reduction in bias is such that the mean-square error is reduced. This is an exact result for any sample 
size. The implication of these findings is that Quenouille’s device should frequently lead to an 
improvement in the accuracy of estimation, particularly in sample surveys where ratio estimators are very 
commonly employed. 


2. Suppose x is a normal variable with variance O(n⁻¹). Choose the units of measurement so that 
E(x) = 1, and let x = 1 − ξ. Denote V(x) by h. For sufficiently large n we have 


E(x⁻¹) = E(1 + ξ + ξ² + ...). 
Taking the first four non-vanishing terms we find 


E(x⁻¹) = 1 + h + 3h² + 15h³ + O(n⁻⁴). (1) 
Similarly, E(x⁻²) = E(1 + 2ξ + 3ξ² + ...) 
= 1 + 3h + 15h² + 105h³ + O(n⁻⁴). (2) 


From now on we neglect the terms of O(n⁻⁴). 

Suppose that r = y/x is a ratio estimator of ρ = E(y)/E(x), where the regression of y on x is linear of 
the form y = α + βx + u, and where the variance of u is a constant δ of O(n⁻¹). Then ρ = α + β. Also 
E(r) = αE(x⁻¹) + β. Using (1) the bias is therefore 


α(h + 3h² + 15h³), (3) 
which is O(n⁻¹). 
Now r = β + (α + u)/x. Consequently 
E(r − β)² = (α² + δ) E(x⁻²) 
= (α² + δ)(1 + 3h + 15h² + 105h³) (4) 
using (2). From (3) and (4) we find 
V(r) = α²(h + 8h² + 69h³) + δ(1 + 3h + 15h² + 105h³). (5) 
3. On splitting the sample into two halves we have the ratio estimators r₁ = y₁/x₁ and r₂ = y₂/x₂, 
where y = ½(y₁ + y₂) and x = ½(x₁ + x₂). Suppose that y₁ = α + βx₁ + u₁ and y₂ = α + βx₂ + u₂ where 
E(u_i | x_i) = 0 and E(u_i² | x_i) = 2δ (i = 1, 2). Then u = ½(u₁ + u₂). 
Quenouille’s estimator is 


t = 2r − ½(r₁ + r₂) 
= β + α{2/x − ½(1/x₁ + 1/x₂)} + (u₁ + u₂)/x − ½(u₁/x₁ + u₂/x₂). 

Since V(x₁) = V(x₂) = 2h we find on replacing h by 2h in (1) and (2) E(x_i⁻¹) = 1 + 2h + 12h² + 120h³ and 
E(x_i⁻²) = 1 + 6h + 60h² + 840h³ (i = 1, 2). Thus 


E(t − β) = αE{2/x − ½(1/x₁ + 1/x₂)} 
= α(1 − 6h² − 90h³). (6) 
Thus the bias is −α(6h² + 90h³), which is O(n⁻²). 


Now E{2/x − ½(1/x₁ + 1/x₂)}² = E{4/x² − (2/x)(1/x₁ + 1/x₂) + ¼(1/x₁ + 1/x₂)²} 
= E{4/x² + ¼(1/x₁² + 1/x₂²) − (7/2)(1/(x₁x₂))} 
= 4(1 + 3h + 15h² + 105h³) + ½(1 + 6h + 60h² + 840h³) − (7/2)(1 + 2h + 12h² + 120h³)² 
= 1 + h − 8h² − 168h³. 
Also (u₁ + u₂)/x − ½(u₁/x₁ + u₂/x₂) = u₁{1/x − 1/(2x₁)} + u₂{1/x − 1/(2x₂)}, 
whence E{(u₁ + u₂)/x − ½(u₁/x₁ + u₂/x₂)}² = 2δE{2/x² + ¼(1/x₁² + 1/x₂²) − 2/(x₁x₂)} 
= 2δ(½ + h + 4h² + 54h³). 
Consequently, E(t − β)² = α²(1 + h − 8h² − 168h³) + δ(1 + 2h + 8h² + 108h³). (7) 
From (6) and (7) we find 
V(t) = V(t − β) = α²(h + 4h² + 12h³) + δ(1 + 2h + 8h² + 108h³). (8) 


Comparing (5) and (8) we see that in spite of the fact that, for sufficiently large n, t has a smaller bias 
than r, V(t) is smaller than V(r). 
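
The algebra leading to (5) and (8) can be checked mechanically by multiplying truncated power series in h. The sketch below is only a verification aid under the stated normal-theory moments; the helper names (`mul`, `combine`, `inv_moment`, `inv_sq_moment`) are hypothetical, and series are stored as coefficient lists for h⁰, ..., h³.

```python
from fractions import Fraction as Fr

ORDER = 3  # keep h^0 .. h^3, i.e. neglect terms of O(n^-4)

def mul(p, q):
    """Product of two truncated series in h."""
    out = [Fr(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out

def combine(*terms):
    """Linear combination of series; each term is a (scalar, series) pair."""
    out = [Fr(0)] * (ORDER + 1)
    for c, p in terms:
        for i, a in enumerate(p):
            out[i] += Fr(c) * a
    return out

def inv_moment(scale):
    """E(1/x) for normal x with mean 1 and variance scale*h, from expansion (1)."""
    return [Fr(c) * Fr(scale) ** k for k, c in enumerate((1, 1, 3, 15))]

def inv_sq_moment(scale):
    """E(1/x^2) for normal x with mean 1 and variance scale*h, from expansion (2)."""
    return [Fr(c) * Fr(scale) ** k for k, c in enumerate((1, 3, 15, 105))]

m1, m2 = inv_moment(1), inv_sq_moment(1)   # whole sample, variance h
h1, h2 = inv_moment(2), inv_sq_moment(2)   # half samples, variance 2h
cross = mul(h1, h1)                        # E{1/(x1 x2)} by independence

# Ordinary ratio estimator r: coefficients of alpha^2 and delta in V(r).
var_r_alpha = combine((1, m2), (-1, mul(m1, m1)))
var_r_delta = m2

# Quenouille's estimator t: coefficients of alpha^2 and delta in V(t).
mean_t = combine((2, m1), (-1, h1))
second_t = combine((4, m2), (Fr(1, 2), h2), (Fr(-7, 2), cross))
var_t_alpha = combine((1, second_t), (-1, mul(mean_t, mean_t)))
var_t_delta = combine((4, m2), (1, h2), (-4, cross))

print(var_r_alpha, var_r_delta)
print(var_t_alpha, var_t_delta)
```

The four lists reproduce the coefficients h + 8h² + 69h³, 1 + 3h + 15h² + 105h³, h + 4h² + 12h³ and 1 + 2h + 8h² + 108h³ of (5) and (8).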


4. The error entailed by using the first few terms of the expansion of (1 − ξ)⁻¹ in place of 1/x is 
1/x − (1 + ξ + ... + ξ⁷) = ξ⁸/x. The expected value of this is approximately 105h⁴. The error in the 
corresponding expansion for 1/x² is 1/x² − (1 + 2ξ + ... + 8ξ⁷) which has an expectation of approximately 
945h⁴. Now h is the square of the coefficient of variation of x. Even for a coefficient of variation as large 
as 20% we have 945h⁴ < 1/400. Thus one would expect the errors arising from the use of the expansions 
to be small even for quite moderate values of n. 


5. Let us now consider the case where the regression of y on x has the same form as has been stipulated 
above, but where x now has the Γ-distribution with density x^(m−1) exp(−x)/Γ(m). It follows easily that 


E(1/x) = 1/(m − 1), E(1/x²) = 1/{(m − 1)(m − 2)}. 


Taking x = ½(x₁ + x₂) as before, and supposing that ½x₁ and ½x₂ are independent Gamma variables with 
parameters ½m, we have 


E(1/x_i) = 1/(m − 2), E(1/x_i²) = 1/{(m − 2)(m − 4)} (i = 1, 2). 


Thus we have E(r − β) = α/(m − 1), E(r − β)² = (α² + δ)/{(m − 1)(m − 2)}. 


Consequently the bias of r is 

α/(m − 1) − α/m = α/{m(m − 1)}, (9) 

and the mean square error of r is 

M(r) = E(r − ρ)² 
= E(r − β)² − (2α/m) E(r − β) + α²/m² 
= α²{1/((m − 1)(m − 2)) − 2/(m(m − 1)) + 1/m²} + δ/{(m − 1)(m − 2)} 
= α²(m + 2)/{m²(m − 1)(m − 2)} + δ/{(m − 1)(m − 2)}. (10) 
As before, Quenouille’s estimator is 

t = β + α{2/x − ½(1/x₁ + 1/x₂)} + (u₁ + u₂)/x − ½(u₁/x₁ + u₂/x₂), 

so that E(t − β) = α(m − 3)/{(m − 1)(m − 2)}. 

Thus the bias is −2α/{m(m − 1)(m − 2)}. (11) 

Also E(t − β)² = α²E{2/x − ½(1/x₁ + 1/x₂)}² + E{(u₁ + u₂)/x − ½(u₁/x₁ + u₂/x₂)}² 
= α²E{4/x² + ¼(1/x₁² + 1/x₂²) − (7/2)(1/(x₁x₂))} + 2δE{2/x² + ¼(1/x₁² + 1/x₂²) − 2/(x₁x₂)} 
= α²{4/((m − 1)(m − 2)) + 1/(2(m − 2)(m − 4)) − 7/(2(m − 2)²)} 
+ 2δ{2/((m − 1)(m − 2)) + 1/(2(m − 2)(m − 4)) − 2/(m − 2)²} 
= α²(m² − 8m + 19)/{(m − 1)(m − 2)²(m − 4)} + δ(m² − 7m + 18)/{(m − 1)(m − 2)²(m − 4)}. 

Consequently the mean square error of t is 

M(t) = E(t − β)² − (2α/m) E(t − β) + α²/m² 
= α²(m³ − 5m² + 12m + 16)/{m²(m − 1)(m − 2)²(m − 4)} + δ(m² − 7m + 18)/{(m − 1)(m − 2)²(m − 4)}. (12) 

Comparing (10) and (12) we have 

M(r) − M(t) = α²(m − 16)/{m(m − 1)(m − 2)²(m − 4)} + δ(m − 10)/{(m − 1)(m − 2)²(m − 4)}. (13) 



Thus the mean square error of t is certainly less than that of r provided that m > 16 and might be less 
for other values of m between 10 and 16. Now in this formulation x has mean m and variance m. Thus 
the coefficient of variation of x is m^(−½). We conclude that whenever the coefficient of variation of x is 
less than ¼, which will be satisfied by all except the most inaccurate estimators, Quenouille’s estimator 
has a smaller mean square error than the ordinary ratio estimator. This is an exact result for any sample 
size. Coefficients of variation of greater than ¼ will generally result from smallness in sample size. For 
such cases an alternative method suggested by Quenouille (1956) will often be practicable and might 
well lead to a reduction in mean square error. The method can be described by considering the case of 
an estimator t based on n univariate observations. Let t₁, ..., t_n denote the estimators obtained by 
omitting the first, second, ..., nth observations in turn. Then 

nt − (n − 1) Σ_{i=1}^{n} t_i/n 

has bias of lower order than t and variance of the same order. Developments of this idea could be worked out 
for different types of ratio estimators but the matter will not be pursued here. 
The analysis for ratios with Γ-type denominators has been carried out in terms of mean square error 
since the variance of t is found to be greater than that of r. Of course, an estimator which has both smaller 
bias and smaller mean square error than another is naturally preferable. 
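
The exact comparison (13) for the Γ case can be confirmed with rational arithmetic. The sketch below rebuilds M(r) and M(t) from the inverse moments quoted in this section and compares their difference with the closed form; the function names and the trial values of α and δ are assumptions made only for illustration, and m must exceed 4 for the moments to exist.

```python
from fractions import Fraction as Fr

def mse_ratio(m, alpha, delta):
    """M(r), built from E(1/x) and E(1/x^2) for x ~ Gamma(m)."""
    alpha, delta = Fr(alpha), Fr(delta)
    first = alpha * Fr(1, m - 1)                           # E(r - beta)
    second = (alpha ** 2 + delta) * Fr(1, (m - 1) * (m - 2))  # E(r - beta)^2
    return second - 2 * alpha / m * first + alpha ** 2 / Fr(m) ** 2

def mse_quenouille(m, alpha, delta):
    """M(t), built from the whole-sample and half-sample inverse moments."""
    alpha, delta = Fr(alpha), Fr(delta)
    e2 = Fr(1, (m - 1) * (m - 2))      # E(1/x^2)
    h2 = Fr(1, (m - 2) * (m - 4))      # E(1/x_i^2)
    cross = Fr(1, (m - 2) ** 2)        # E{1/(x1 x2)} = E(1/x_i)^2
    first = alpha * (Fr(2, m - 1) - Fr(1, m - 2))
    second = (alpha ** 2 * (4 * e2 + h2 / 2 - Fr(7, 2) * cross)
              + 2 * delta * (2 * e2 + h2 / 2 - 2 * cross))
    return second - 2 * alpha / m * first + alpha ** 2 / Fr(m) ** 2

def tabulated_gain(m, alpha, delta):
    """Right-hand side of (13)."""
    alpha, delta = Fr(alpha), Fr(delta)
    common = (m - 1) * (m - 2) ** 2 * (m - 4)
    return alpha ** 2 * (m - 16) / (m * common) + delta * (m - 10) / common

for m in (8, 12, 20):
    gain = mse_ratio(m, 3, 2) - mse_quenouille(m, 3, 2)
    print(m, gain == tabulated_gain(m, 3, 2), gain > 0)
```

With these trial values the gain is negative at m = 8 and positive at m = 20, in line with the thresholds m = 10 and m = 16 read off from (13).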


6. It may be of interest to quote the results obtained by applying the method to samples of two from 
the population of four pairs (x, y) used by Goodman & Hartley (1958) to compare a number of other 
types of estimators of the population mean of the y’s. The population values are (2, 2), (2, 6), (4, 6), (6, 10). 
Quenouille’s estimator ȳ_Q and the four estimators considered by Goodman and Hartley, together with 
mean square errors, are: 


Estimator                                                              Mean square error 
ȳ_Q = X̄[2(y₁ + y₂)/(x₁ + x₂) − ½(y₁/x₁ + y₂/x₂)]                       0.379 
X̄(y₁ + y₂)/(x₁ + x₂)                                                   0.917 
½X̄(y₁/x₁ + y₂/x₂)                                                      2.407 
½X̄(y₁/x₁ + y₂/x₂) + (3/2)[½(y₁ + y₂) − ¼(y₁/x₁ + y₂/x₂)(x₁ + x₂)]      0.563 
ȳ = ½(y₁ + y₂)                                                         2.667 


In these formulae (x₁, y₁) and (x₂, y₂) are the sample values and X̄ is the population mean of the x’s. We 
see that ȳ_Q has a smaller mean square error than any of the estimators considered by Goodman & Hartley. 


REFERENCES 
Goopman, L. A. & Hartitey, H. O. (1958). J. Amer. Statist. Ass. 53, 491-508. 
QUENOUILLE, M. H. (1949). J. R. Statist. Soc. B, 11, 68-84. 
QUENOUILLE, M. H. (1956). Biometrika, 43, 353-60. 
On the probability integral transformation 


By C. L. MALLOWS 
University College London 


1. Let x₁, x₂, ..., x_n be n random variables with a common probability density function f(x, θ₁, θ₂, ..., θ_s) 
depending on s parameters, where 
f(x, θ₁, θ₂, ..., θ_s) > 0 (a < x < b). (1) 
If the parameters are known, the probability integral transformation defined by 


z_i = ∫_a^{x_i} f(t, θ₁, θ₂, ..., θ_s) dt (a < x_i < b, i = 1, 2, ..., n) 


defines n random variables z₁, z₂, ..., z_n which are independent and are rectangularly distributed in 
(0, 1). However, if we estimate the parameters by means of functions 


F₁(x₁, ..., x_n), ..., F_s(x₁, ..., x_n) 


of the observed values (which functions we may assume are not functionally dependent), and define 
n new random variables by 


y_i = ∫_a^{x_i} f(t, F₁, F₂, ..., F_s) dt 
= g(x_i, F₁, F₂, ..., F_s), say (a < x_i < b, i = 1, 2, ..., n), 
then in general these variables will not be independent, nor will they be rectangularly distributed. 


2. David & Johnson (1948) have remarked that the form of the joint distribution of the y’s will 
depend on the form of the distribution of the x’s, and may depend even on the values of the parameters 
in that distribution. Barton (1956) discussed these results while investigating Neyman’s ψ²-tests when 
the null hypothesis is composite, and Chernoff & Lehmann (1954) and Watson (1958) have considered 
related problems with reference to χ²-tests of goodness of fit. In these cases, the typical result is that the 
asymptotic distribution of the criterion considered is that of the sum of a χ² variable and one or more 
weighted squares of unit Normal variables; there is a partial loss of degrees of freedom relative to the 
case where the parameters are known. Barton shows by example that the amount of this loss may 
depend on the form of the null law, the particular estimators used, and even on the values of the 
parameters. 

As a contribution to the study of the small-sample case, in the present note we show that the 
phenomenon observed by David & Johnson, that in some cases there are functional constraints between the 
y’s, is not at all general. On the basis of particular cases studied, David & Johnson conjectured that the 
rank of the Jacobian J (see (2)) will in general lie between n − s and n. It will be shown that only in 
special cases will the rank be less than n. Notice that this loss of dimensionality does not correspond to 
the loss of degrees of freedom mentioned above. 





3. Let g_i0 = g₀(x_i, F₁, ..., F_s) = f(x_i, F₁, ..., F_s), 
g_ir = g_r(x_i, F₁, ..., F_s) = ∂g(x_i, F₁, ..., F_s)/∂F_r, 
F_ri = ∂F_r/∂x_i, G_ir = g_ir/g_i0 
for r = 1, 2, ..., s; i = 1, 2, ..., n 


(by (1), g_i0 > 0 for a < x_i < b). Then we have 


∂y_i/∂x_i = g_i0 + Σ_{r=1}^{s} g_ir F_ri (i = 1, 2, ..., n), 

∂y_i/∂x_j = Σ_{r=1}^{s} g_ir F_rj (i, j = 1, 2, ..., n; i ≠ j). 


The Jacobian matrix J of the transformation is given by 


|J| = ∂(y₁, y₂, ..., y_n)/∂(x₁, x₂, ..., x_n) = (Π_{i=1}^{n} g_i0) |I_n + GF|, (2) 

where I_n is the unit n × n matrix, G is the n × s matrix (G_ir) and F is the s × n matrix (F_ri). Consider the 
partitioned (n + s) × (n + s) matrix 

A = [ I_n + GF   0   ] 
    [ 0          I_s ] 

where I_s is the unit s × s matrix. We have 


ρ(J) = ρ(I_n + GF) = ρ(A) − s. 


Also 
[ I_n  0   ] [ I_n  G   ] [ I_n + GF  0   ] [ I_n  0   ] [ I_n  −G  ]   [ I_n  0        ] 
[ F    I_s ] [ 0    I_s ] [ 0         I_s ] [ −F   I_s ] [ 0    I_s ] = [ 0    I_s + FG ] = B, say. 


All the multipliers on the left-hand side of this equation are non-singular; hence 
n − ρ(J) = n + s − ρ(A) = n + s − ρ(B) = s − ρ(I_s + FG). 
In general, I_s + FG is non-singular, and so ρ(J) = n; only when some special relationship exists between 
f and the estimators {F_r} will ρ(J) be less than n. We remark that the form of this relationship does 
not seem to have anything to do with maximum likelihood. 
If ρ(J) = n − s, we have ρ(I_s + FG) = 0, i.e. 
I_s + FG = 0 
and conversely. Two lines of development for this case are now available. We may study either: 
Problem (a), to find which forms for f(x_i, θ₁, ..., θ_s) satisfy these s² equations for given estimators 
F₁, ..., F_s; or 
Problem (b), to inquire what estimators F₁, ..., F_s should be used in any given case (i.e. given G_ir) 
to satisfy these equations. 
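
The rank relation n − ρ(J) = s − ρ(I_s + FG) can be illustrated in the simplest case s = 1. The sketch below assumes g = h(x − θ), so that G_i1 = −1, and takes F to be the sample mean, so that F_1i = 1/n; the `rank` helper is a hypothetical exact Gaussian elimination written for this illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix of Fractions by Gauss-Jordan elimination."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

n = 5
G = [[Fraction(-1)] for _ in range(n)]   # n x 1: G_i1 = -1 when g = h(x - theta)
F = [[Fraction(1, n)] * n]               # 1 x n: dF/dx_i = 1/n for the sample mean

I_plus_GF = [[(1 if i == j else 0) + G[i][0] * F[0][j] for j in range(n)]
             for i in range(n)]
one_plus_FG = 1 + sum(F[0][i] * G[i][0] for i in range(n))

print(rank(I_plus_GF), one_plus_FG)  # 4 0
```

Here I_s + FG reduces to the scalar 1 + FG = 0, and correspondingly I_n + GF has rank n − 1.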


4. Problem (a). Suppose s = 2. For ρ(J) to be n − 2 we require 


1 + Σ_{i=1}^{n} (g_i1/g_i0) ∂F₁/∂x_i = 0,   Σ_{i=1}^{n} (g_i2/g_i0) ∂F₁/∂x_i = 0, 
Σ_{i=1}^{n} (g_i1/g_i0) ∂F₂/∂x_i = 0,   1 + Σ_{i=1}^{n} (g_i2/g_i0) ∂F₂/∂x_i = 0. 
Now if, for example, 

F₁ = x̄, F₂ = s² = (1/ν) Σ_{i=1}^{n} (x_i − x̄)² (ν = n or n − 1), 

so that 
F_1i = 1/n, F_2i = (2/ν)(x_i − x̄), 

we find the most general forms for G_i1 and G_i2 are 

g_i1/g_i0 = −1, g_i2/g_i0 = −(x_i − x̄)/(2s²). 

In order to find the most general form for g, we must now solve the simultaneous partial differential 
equations 
∂g/∂x + ∂g/∂θ₁ = 0, 
{(x − θ₁)/(2θ₂)} ∂g/∂x + ∂g/∂θ₂ = 0, 

the general solution of which (Piaggio, 1928, § 142) is 

g(x, θ₁, θ₂) = h{(x − θ₁)/√θ₂} (h arbitrary). 

For the case s = 1 we find in a similar fashion the solutions 
F = x̄: g(x, θ) = h{φ(θ)(x − θ)} (h, φ arbitrary), 
F = (1/n) Σ ψ(x_i): g(x, θ) = h{φ(θ)(ψ(x) − θ)} (h, φ arbitrary), 
F = (1/ν) Σ (x_i − x̄)²: g(x, θ) = h{x/√θ + φ(θ)} (h, φ arbitrary). 

For the case s = 3 with 
there is no possible form for g which will make ρ(J) = n − 3; it is, however, possible to reduce the rank 
by 2. We find the most general solution for which ρ(J) = n − 2 to be given by g = h(λ) where 





A = A(u,w) = const. 
is an integral of du +(o,(w) + ud,(w) + u*d,(w)) dw = 0, 
where u = («—0,)/04, w = 04/64 and h, ¢,, ¢, and ¢, are arbitrary functions. 
5. Problem (b). If s = 1, and g = h(x − θ), we require 

Σ_{i=1}^{n} ∂F/∂x_i = 1. 

This will be satisfied if F is any (differentiable) function satisfying 

F(x₁ + a, x₂ + a, ..., x_n + a) = F(x₁, x₂, ..., x_n) + a (3) 


for all allowable x and a. We note that this is the standard condition for the statistic F to be a ‘measure 
of location’. 
If s = 2 and g = h{(x − θ₁)/√θ₂}, we require 

Σ_{i=1}^{n} ∂F₁/∂x_i = 1,   Σ_{i=1}^{n} {(x_i − F₁)/(2F₂)} ∂F₁/∂x_i = 0, 
Σ_{i=1}^{n} ∂F₂/∂x_i = 0,   Σ_{i=1}^{n} {(x_i − F₁)/(2F₂)} ∂F₂/∂x_i = 1. 


These equations are satisfied if F₁ satisfies condition (3) and also 
F₁(kx₁, kx₂, ..., kx_n) = kF₁(x₁, x₂, ..., x_n), 
while F₂ satisfies 
F₂(x₁ + a, x₂ + a, ..., x_n + a) = F₂(x₁, x₂, ..., x_n), 
F₂(kx₁, kx₂, ..., kx_n) = k²F₂(x₁, x₂, ..., x_n) 


for all allowable x, a and k. We note that these are the standard conditions for √F₂ to be a ‘measure of 
dispersion’. 
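
A concrete instance of these conditions is the normal law with F₁ the sample mean and F₂ the divisor-n sample variance, for which y_i = Φ{(x_i − F₁)/√F₂}. The sketch below, with hypothetical helper names and an arbitrary sample, checks that the y's are unchanged when every observation is shifted and rescaled, which is one way of seeing the two functional constraints corresponding to ρ(J) = n − 2.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def estimated_transform(xs):
    """y_i = Phi((x_i - mean) / sd), with mean and sd estimated from the same sample."""
    mean = sum(xs) / len(xs)
    variance = sum((x - mean) ** 2 for x in xs) / len(xs)  # nu = n
    sd = math.sqrt(variance)
    return [normal_cdf((x - mean) / sd) for x in xs]

sample = [1.2, 0.4, 3.1, 2.2, 5.0]               # illustrative values only
transformed = estimated_transform(sample)
shifted_and_scaled = estimated_transform([3.0 + 2.0 * x for x in sample])

print(all(math.isclose(a, b, abs_tol=1e-12)
          for a, b in zip(transformed, shifted_and_scaled)))  # True
```

Because the y's do not move along the directions (1, ..., 1) and (x₁, ..., x_n), the Jacobian annihilates both directions and its rank cannot exceed n − 2.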

6. The above particular solutions cover a large proportion of the usual cases; there are, however, 
numerous important distributions which do not fall into any of the above forms; for example, the Γ 
(or χ²) distribution where the index parameter is estimated from the mean of the sample. In this case 
ρ(J) = n. 
Extrema of quadratic forms with applications to statistics 


By K. A. BUSH AND I. OLKIN* 
University of Idaho and Michigan State University 


Introduction and summary. In many statistical problems, one is frequently confronted with the 
determination of the extremum of a quadratic form or a ratio of quadratic forms, occasionally with 
some restrictions on the space. In this paper we consider a number of such problems: (i) stratified 
sampling, (ii) a moment problem, (iii) discrimination, (iv) canonical correlations, and show how the 
solutions may be obtained as a consequence of two inequalities. Standard results on Hotelling’s T², 
multiple correlation, and reliability may be similarly obtained. 


* This work was supported in part by the Office of Ordnance Research, U.S. Army, while the author 
was on leave of absence at the University of Chicago. 
Some examples. In what follows the notation E(x) and V(x) will denote the mean and variance, 
respectively. 

(i) Stratified sampling. Let X₁₁, ..., X_{1N₁}; X₂₁, ..., X_{2N₂}; ...; X_{k1}, ..., X_{kN_k}, N₁ + ... + N_k = N, be a 
finite population consisting of k strata, with mean X̄ = ΣX_ij/N and intra-stratum variances S_i², 
i = 1, ..., k. A simple random sample x_{i1}, ..., x_{in_i} (i = 1, ..., k) is chosen and an unbiased estimate of X̄ 
is given by 

x̄_st = Σ N_i x̄_i/N, 

where x̄_i is the sample mean of the ith stratum. The variance of x̄_st is given by 


V(x̄_st) = Σ_{i=1}^{k} N_i²S_i²/(N²n_i) − Σ_{i=1}^{k} N_iS_i²/N². 


The problem of optimum allocation is to determine n₁, ..., n_k so as to minimize V(x̄_st) subject to the 
restriction a₁n₁ + ... + a_kn_k = c. If we let 


r = (r₁, ..., r_k), r_i = N_iS_i/N, D_a = diag. (a₁, ..., a_k), D_n = diag. (n₁, ..., n_k), e = (1, ..., 1): 1 × k, 


then the problem is to find 

Min_n rD_n⁻¹r′ such that eD_aD_ne′ = c. 


n 


(ii) A moment problem. Karush & Wolfsohn (1955) consider the problem of determining the minimum 
distance from the origin to a point w = (w₁, ..., w_k) in k-dimensional Euclidean space subject to the 
restrictions 


Σ_{i=1}^{k} w_i = 1,   Σ_{i=1}^{k} a_iw_i = 1,   ...,   Σ_{i=1}^{k} a_i^m w_i = 1, 


for some m < k − 1. If we let e = (1, ..., 1): 1 × (m + 1), 


B = [ 1     1     ...  1     ] 
    [ a₁    a₂    ...  a_k   ] 
    [ ...                    ] 
    [ a₁^m  a₂^m  ...  a_k^m ] 


then the problem becomes that of finding 


Min_w ww′ such that wB′ = e. 
w 


(iii) Discrimination. Let x = (x₁, ..., x_p) follow a p-variate normal distribution with mean vector θ 
and covariance matrix Σ. Consider a linear combination of the variates, d = w₁x₁ + ... + w_px_p = wx′; 
then E(d) = wθ′ and V(d) = wΣw′. The classical procedure is to choose the weight vector w so as to 
maximize 

(Ed)²/V(d) = (wθ′)²/(wΣw′). 


(iv) Canonical correlation. Let y₁, ..., y_q; x₁, ..., x_p follow a (p + q)-variate normal distribution with 
mean vector zero and covariance matrix 

Σ = [ Σ₁₁  Σ₁₂ ] 
    [ Σ₂₁  Σ₂₂ ] 


where Σ₁₁ = cov (y, y), Σ₁₂ = cov (y, x), Σ₂₂ = cov (x, x). Consider linear combinations 
d₁ = u₁y₁ + ... + u_qy_q, 


d₂ = v₁x₁ + ... + v_px_p, 


and choose the weight functions u and v to maximize the correlation coefficient between d₁ and d₂. 


We note that Ed₁d₂ = uΣ₁₂v′, V(d₁) = uΣ₁₁u′, V(d₂) = vΣ₂₂v′, so that the problem is to find 


Max_{u, v} uΣ₁₂v′/{√(uΣ₁₁u′) √(vΣ₂₂v′)}. 
Two inequalities. A perusal of the problems indicates that they are of two types: 
(I) Find Min_w wAw′ such that wB = a, where w: 1 × p, A: p × p is positive definite, B: p × k, p ≥ k, 
is of rank k. 

(II) Find Max_w wAw′/(wBw′), where A: p × p is positive semi-definite, B: p × p is positive definite. 

The usual solutions may involve Lagrange multipliers, direct differentiation, or some geometrical 
argument. The following inequalities permit solutions to (I) and (II). 
Inequality 1. If M is positive definite, then 


(xy′)² ≤ (xMx′)(yM⁻¹y′), (1) 
with equality holding if and only if xM = αy, where α is a scalar. 

Proof. Since M is positive definite, there exists a factorization M = LL′, where L is non-singular. 
Using the Cauchy inequality (uv′)² ≤ (uu′)(vv′) and the correspondence u = xL, v′ = L⁻¹y′, we obtain 
(1). Equality holds if and only if u = αv, i.e. if xM = αy. 
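
A small exact check of Inequality 1 may be helpful. The 2 × 2 matrix M, the vectors, and the helper names below are assumptions chosen only for illustration; the second case takes y proportional to xM, where the inequality should become an equality.

```python
from fractions import Fraction as Fr

def quad(x, M, y):
    """Bilinear form x M y' for row vectors x, y and a square matrix M."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

M = [[Fr(2), Fr(1)], [Fr(1), Fr(2)]]                      # positive definite
M_inv = [[Fr(2, 3), Fr(-1, 3)], [Fr(-1, 3), Fr(2, 3)]]    # its inverse

def gap(x, y):
    """(x M x')(y M^-1 y') - (x y')^2, non-negative by Inequality 1."""
    xy = sum(a * b for a, b in zip(x, y))
    return quad(x, M, x) * quad(y, M_inv, y) - xy ** 2

x = [Fr(1), Fr(2)]
generic_gap = gap(x, [Fr(3), Fr(-1)])
xM = [sum(x[i] * M[i][j] for i in range(2)) for j in range(2)]
equality_gap = gap(x, [v / 5 for v in xM])   # y = xM / alpha with alpha = 5

print(generic_gap, equality_gap)  # 361/3 0
```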

Alternatively we can now show that (1) is a special case of the generalized Cauchy inequality which 
states that in a linear space with inner product we have 


|(f, g)| ≤ ||f|| · ||g||, 
where ||f|| = (f, f)^½ is the norm. 


The space of all n-dimensional vectors is linear and the inner product may be chosen as xy′. Then if L 
is any n × n non-singular matrix and x and y vectors, we get 


u = xL, v = y(L⁻¹)′, (u, v) = xy′, 
||u||² = xLL′x′ = xMx′, 
||v||² = y(L⁻¹)′L⁻¹y′ = yM⁻¹y′, 

to obtain (xy′)² ≤ (xMx′)(yM⁻¹y′). 
By making the correspondence w = x, A = M, aB′ = y, we obtain the bound in I, 

(wAw′) ≥ (wBa′)²/(aB′A⁻¹Ba′) = (aa′)²/(aB′A⁻¹Ba′), (1′) 


with equality holding if and only if wA = αaB′. Because of the constraint we also have 


a = wB = αaB′A⁻¹B,   α = (aa′)/(aB′A⁻¹Ba′), 

so that equality holds if and only if 

w = (aa′) aB′A⁻¹/(aB′A⁻¹Ba′). 


Inequality 2. If A is positive semi-definite and B is positive definite, then 


θ_m ≤ wAw′/(wBw′) ≤ θ_M, (2) 
where θ₁, ..., θ_p are the roots of |A − θB| = 0, θ_m = min (θ₁, ..., θ_p), θ_M = max (θ₁, ..., θ_p). 
Proof. There exists a non-singular transformation w = uQ such that 
wAw′ = uQAQ′u′ = uD_θu′, 
wBw′ = uQBQ′u′ = uu′, 
where D_θ = diag. (θ₁, ..., θ_p) and θ₁, ..., θ_p are the roots of |A − θB| = 0. Hence 


θ_m ≤ wAw′/(wBw′) = uD_θu′/(uu′) ≤ θ_M. 


We note that if A is of rank 1, then A = a′a, and there is one non-zero root of |a′a − θB| = 0, namely 
aB⁻¹a′, a fact which is apparent from (3) below. Thus in this case the two inequalities yield the same result 

(wa′)²/(wBw′) ≤ aB⁻¹a′ = θ_M. 
At this point we can solve (i)–(iv). Problems (i), (ii) are solved using (1); (iii) may be solved using (1) 
or (2); (iv) is solved using (2). 
(i) (rD_n⁻¹r′)(eD_a^½ D_n D_a^½ e′) ≥ (rD_a^½ e′)² gives the lower bound, with equality holding if and only if 
eD_a^½ D_n = αr. Because of the restriction, the constant 


α = (eD_a^½ D_n D_a^½ e′)/(rD_a^½ e′) = c/(rD_a^½ e′), 

and hence (n₁, ..., n_k) = eD_n = αrD_a^(−½) = crD_a^(−½)/(rD_a^½ e′), i.e. 

n_i = c(N_iS_i/√a_i)/Σ_j N_jS_j√a_j. 
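
The allocation rule for problem (i) can be sketched numerically. The function names and the two-stratum figures below are hypothetical, and the allocations are left as real numbers rather than rounded to integers.

```python
import math

def optimum_allocation(stratum_sizes, stratum_sds, unit_costs, budget):
    """n_i = c * (N_i S_i / sqrt(a_i)) / sum_j N_j S_j sqrt(a_j)."""
    weights = [N * S / math.sqrt(a)
               for N, S, a in zip(stratum_sizes, stratum_sds, unit_costs)]
    norm = sum(N * S * math.sqrt(a)
               for N, S, a in zip(stratum_sizes, stratum_sds, unit_costs))
    return [budget * w / norm for w in weights]

def variable_part(allocation, stratum_sizes, stratum_sds):
    """Sum of N_i^2 S_i^2 / n_i, proportional to the term of V that depends on n."""
    return sum((N * S) ** 2 / n
               for N, S, n in zip(stratum_sizes, stratum_sds, allocation))

sizes, sds, costs, budget = [100, 200], [1.0, 2.0], [1.0, 4.0], 100.0  # assumed figures
best = optimum_allocation(sizes, sds, costs, budget)
other = [best[0] + 4.0, best[1] - 1.0]      # another allocation with the same total cost

print([round(n, 2) for n in best])           # [11.11, 22.22]
print(variable_part(best, sizes, sds) < variable_part(other, sizes, sds))  # True
```

For these figures the minimized quantity equals (Σ N_iS_i√a_i)²/c = 8100, the lower bound given by the inequality.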
(ii) The solution is given directly by (1′) with A = I, a = e: 1 × (m + 1), and B replaced by B′: 

ww′ ≥ (m + 1)²/(eBB′e′), 

with equality holding if and only if w ∝ eB. 

(iii) Either (1) or (2) leads to the bound θΣ⁻¹θ′. 
(iv) Using (1) or (2) to obtain the first inequality and (2) to obtain the second inequality, we have 


(uΣ₁₂v′)²/{(uΣ₁₁u′)(vΣ₂₂v′)} ≤ uΣ₁₂Σ₂₂⁻¹Σ₂₁u′/(uΣ₁₁u′) ≤ θ_M, 


where θ_M is the maximum root of |Σ₁₂Σ₂₂⁻¹Σ₂₁ − θΣ₁₁| = 0. 
We note that the expression aA⁻¹a′ occurs as the solution of some of these problems, and we obtain 
an alternative expression from 

|M + x′y| = |M|(1 + yM⁻¹x′), 


where M is non-singular and x, y are 1 × n row vectors. This follows from the fact that x′y is of rank 1, 
so that 
|I + M⁻¹x′y| = 1 + tr (M⁻¹x′y) = 1 + yM⁻¹x′. (3) 

Thus if x = y = a, we have aA⁻¹a′ = |A + a′a|/|A| − 1. 
On certain properties of power-series distributions* 


By C. G. KHATRI 
M.S. University of Baroda 


INTRODUCTION 
1. Noack (1950) has defined a power-series distribution as 
P(ξ = x) = a_xZ^x/f(Z) for x = 0, 1, 2, ... (1) 

where a_xZ^x > 0, a_x is a function of x or constant, and f(Z) = Σ_{x=0}^{∞} a_xZ^x is convergent for |Z| < r. Here we 
establish the recurrence relations for cumulants and factorial cumulants, which are utilized to show that 
any power-series distribution in a single parameter is determined uniquely from its first two moments. 
The multivariate extensions of these results are shown with an illustration of multinomial distribution. 
All the above results extend to truncated power-series distributions. 


* Read at the Second Reunion gathering (18 January 1959) of past and present students of the 
department of Statistics, University of Bombay, Bombay. 
2. Relation between factorial cumulants and cumulants. The rth factorial-moment is defined by 
Eξ(ξ − h)(ξ − 2h) ... (ξ − rh + h), i.e. the coefficient of θ^r/r! in the expansion of E(1 + θh)^(ξ/h). Hence, the 
factorial-moment generating function is 

F(θ) = E(1 + θh)^(ξ/h). (2) 
Note that as h → 0, F(θ) → φ(θ) = E(e^(θξ)), (3) 


the moment-generating function. The relations between the moment-generating function and factorial- 
moment generating function are 


F(θ) = φ{(1/h) log (1 + θh)} (4) 
and φ(θ) = F{(e^(θh) − 1)/h}. (5) 


Hence the relation between the factorial-cumulants and the cumulants can be established. If κ_[r] is 
the rth factorial cumulant and κ_r is the rth cumulant, then 


κ_[r] → κ_r if h → 0. (6) 
3. Properties of power-series distributions. 
(3·1). If κ_[j] is the jth factorial cumulant of the power-series distribution (1), then 


κ_[j] = Z^(jh) (Z^(1−h) d/dZ)^j log f(Z) 


and κ_[j+1] = Z dκ_[j]/dZ − jhκ_[j], 
where (Z^(1−h) d/dZ)^j = (Z^(1−h) d/dZ)(Z^(1−h) d/dZ) ... j times. 
Proof. It can be verified by induction that 
(1 + θh)^j (d/dθ)^j φ{Z(1 + θh)^(1/h)} = Z^(jh) (Z^(1−h) d/dZ)^j φ{Z(1 + θh)^(1/h)} 
and so, with f(Z) defined by (1), 
(d/dθ)^j log f{Z(1 + θh)^(1/h)} = (1 + θh)^(−j) Z^(jh) (Z^(1−h) d/dZ)^j log f{Z(1 + θh)^(1/h)}. (7) 
From (1), it is easy to see that the factorial-moment generating function F(θ) is f{Z(1 + θh)^(1/h)}/f(Z) and so 


κ_[j] = [(d/dθ)^j log F(θ)]_{θ=0} 
= [(d/dθ)^j log f{Z(1 + θh)^(1/h)}]_{θ=0}. (8) 
Hence from (7), it follows immediately that 
κ_[j] = Z^(jh) (Z^(1−h) d/dZ)^j log f(Z). (9) 


Differentiating (9) with respect to Z, we shall have the recurrence relation stated above. 
COROLLARY 1. If h = 0, we have the relations for cumulants as 

κ_j = (Z d/dZ)^j log f(Z) = Z dκ_{j−1}/dZ. (10) 
COROLLARY 2. If h = 1, we have the relations as 

κ_[j] = Z dκ_[j−1]/dZ − (j − 1)κ_[j−1]. (11) 

(3·2). The power-series distribution is uniquely determined from its first two cumulants (or moments). 
Proof. Let the power-series which determines a distribution be given by f(Z) and the first two 
cumulants be given in the parameter p where Z = g(p) is unknown. Then from the cumulant relation (10), 


κ₁ = Z (d/dZ) log f(Z) = {g(p)/g′(p)} (d/dp) log γ(p), where γ(p) = f(g(p)), 

κ₂ = Z dκ₁/dZ = {g(p)/g′(p)} dκ₁/dp. 

Therefore g′(p)/g(p) = (dκ₁/dp)/κ₂. (12) 
Hence log Z = ∫{(dκ₁/dp)/κ₂} dp + c₃ = log g₁(p) + log c₁, say, 

i.e. Z = c₁g₁(p) (13) 
and log γ(p) = ∫{κ₁(dκ₁/dp)/κ₂} dp + c₄ = log γ₁(p) + log c₂, say, 

i.e. γ(p) = c₂γ₁(p), (14) 


where c₁, c₂, c₃ and c₄ are constants. 
Let a_x be the coefficient of {g₁(p)}^x in the expansion of γ₁(p), then c₂c₁^(−x)a_x will be the coefficient of Z^x 
in the expansion of γ(p). Hence, the probability that the random variable ξ takes the value x is 
c₂c₁^(−x)a_xZ^x/γ(p) = a_x{g₁(p)}^x/γ₁(p). (15) 
Hence, without loss of generality, we can assume in equations (13) and (14) that c₁ = c₂ = 1, i.e. 


Z = exp [∫{(dκ₁/dp)/κ₂} dp] 


and f(Z) = exp [∫{κ₁(dκ₁/dp)/κ₂} dp]. (16) 


Now suppose that any two distributions have κ₁ and κ₂ equal, then by the expressions derived in 
(15) and (16), the two distributions are equal. Hence we have proved the theorem. 

Note that two successive cumulants of a power-series distribution other than the first two cumulants 
cannot determine the distribution uniquely. 


4. Illustrative examples. 


(a) Negative binomial distribution. 
Let f(Z) = (1 − Z)^(−n) for 0 < Z < 1, n > 0, Z = p/(1 + p) = p/q, p > 0. Then 

P(ξ = x) = a_xZ^x/f(Z) = {Γ(n + x)/(x! Γ(n))} p^x q^(−n−x) for x = 0, 1, 2, .... (17) 
We have from cumulant relation 
κ_j = Z dκ_{j−1}/dZ = pq dκ_{j−1}/dp 
and so κ₁ = np, κ₂ = npq, κ₃ = npq(q + p) and κ₄ = npq(1 + 6pq). Conversely, if κ₁ and κ₂ are given, 
we have from (16) 
Z = exp {∫ dp/(pq)} = p/q 
and f(Z) = exp {∫ (n/q) dp} = q^n = (1 − Z)^(−n). 


Hence we have the negative binomial distribution. 
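
The recurrence κ_j = pq dκ_{j−1}/dp can be run mechanically on polynomials in p. In the sketch below each cumulant is divided by n and stored as a list of coefficients of p⁰, p¹, ...; the helper names are hypothetical, and q = 1 + p as in example (a).

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

PQ = [0, 1, 1]  # pq = p(1 + p) = p + p^2

def next_cumulant(kappa):
    """kappa_j / n from kappa_{j-1} / n via kappa_j = pq d(kappa_{j-1})/dp."""
    return poly_mul(PQ, poly_diff(kappa))

cumulants = [[0, 1]]                      # kappa_1 / n = p
for _ in range(3):
    cumulants.append(next_cumulant(cumulants[-1]))

for j, kappa in enumerate(cumulants, start=1):
    print(j, kappa)
```

The output lists p, p + p², p + 3p² + 2p³ and p + 7p² + 12p³ + 6p⁴, which are pq, pq(q + p) and pq(1 + 6pq) expanded in powers of p.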
(b) (i) Negative binomial distribution truncated at x = 0. 
Here f(Z) = (1 − Z)^(−n) − 1 for 0 < Z < 1, n > 0, Z = p/q, p > 0. Then 

P(ξ = x) = {Γ(n + x)/(x! Γ(n))} p^x q^(−x)/(q^n − 1) for x = 1, 2, 3, .... (18) 

In this case also, we have κ_j = pq(dκ_{j−1}/dp), and so 
κ₁ = npq^n/(q^n − 1), κ₂ = npq^n(q^(n+1) − q − np)/(q^n − 1)², etc. 
Conversely, from κ₁ and κ₂, we can obtain Z = p/q and f(Z) = q^n − 1, i.e. f(Z) = (1 − Z)^(−n) − 1. 
(ii) Particular case of (i). In (i), let n → 0. Then we shall have 
κ₁ = p/log q and κ₂ = p(q log q − p)/(log q)². (19) 
From these cumulants, we have 


Z = exp {∫ dp/(pq)} = p/q and f(Z) = exp {∫ dp/(q log q)} = log q, 
i.e. f(Z) = −log (1 − Z). 
Hence the distribution is the logarithmic series 


P(ξ = x) = p^x q^(−x)/(x log q) for x = 1, 2, .... (20) 
(c) Positive binomial distribution. 
Let f(Z) = (1 + Z)^n, Z > 0, n is a positive integer, Z = p/(1 − p), 0 < p < 1. Then 

P(ξ = x) = C(n, x) p^x (1 − p)^(n−x) for x = 0, 1, ..., n, (21) 

where C(n, x) is the binomial coefficient. The cumulant relation is 
κ_j = Z dκ_{j−1}/dZ = p(1 − p) dκ_{j−1}/dp, 
and so κ₁ = np and κ₂ = np(1 − p). 


The converse problem is similar to (a) and (b). 
(d) Poisson distribution. This can be treated in the same manner. 
5. Multivariate extension with multinomial distribution as an illustration. 
Let f(Z₁, Z₂, ..., Z_k) = Σ_{x₁, ..., x_k} a_{x₁, ..., x_k} Z₁^(x₁) Z₂^(x₂) ... Z_k^(x_k) 
be a convergent series such that a_{x₁, ..., x_k} may be a function of x₁, ..., x_k or constant and 


a_{x₁, ..., x_k} Z₁^(x₁) ... Z_k^(x_k) > 0. 


Then we shall define the probability of the random variables ξ₁, ..., ξ_k taking the values x₁, ..., x_k, 
respectively, as 
P(ξ₁ = x₁, ..., ξ_k = x_k) = a_{x₁, ..., x_k} Z₁^(x₁) ... Z_k^(x_k)/f(Z₁, ..., Z_k). (22) 
(A) Example. Let f(Z₁, ..., Z_k) = (1 + Z₁ + ... + Z_k)^n, where n is a positive integer, 

Z_i = p_i/p_{k+1}, 0 ≤ p_i < 1, Σ_{i=1}^{k+1} p_i = 1. 

Then P(ξ₁ = x₁, ..., ξ_k = x_k) = {n!/(x₁! ... x_k! x_{k+1}!)} p₁^(x₁) ... p_k^(x_k) p_{k+1}^(x_{k+1}), 

where Σ_{i=1}^{k+1} x_i = n (x_i = 0, 1, 2, ..., n). (23) 


This is a positive multinomial distribution. 


Let Kjy,,...,74) be the factorial cumulant of the distribution given in (22). Then it is easy to show that 
k o\" 
Ktr,, wa II zim( zane =| log f(Z,, tees Z x); (24) 
i=1 04; 
OK irs, ...,7) 
or AS COR eS eS 2,5 hiker. rx 


If h, = 0 for all 2 in (24), we can at once reach the corresponding relations for cumulants. 
Example. Let 
k+1 
f(2,, 2%, ---,Z,) =(L4+4,+...+2,)", where Z;=p;/Pru, OS p; <1, x p=. 
Peas 


We shall here only write down the means, variances and covariances; other cumulants can be written 
down from the recurrence relations. 


k 
E(é,) = nt] (1+ 2; 2) =p, for <= 1,2:...:,%, 
j=1 
é 
Ve = 47 E(é;) =np{l—p, for +=1,2,...,k 
3 , 
and cov (§;, 5) = 2,7 ElE;) = 25-7 E(é,)=—np,p; for i+j=1,2,...,k. (25) 
i Aj 


Now consider the converse problem of obtaining the distribution from means, variances and covariances 
given in parameters (%,, Ys, ..-, ¥;,) if the distribution is of the power-series type. 
Let Z; be the function of (y,, Yo, ..-,Yx), 7 = 1, 2,...,k. Then 





7) k oy, @ 
(é,,£,) = Z,—— BE) = Z, F — — Be, 
cov (Ew Ei) = Zz BGs) = 2, & oe BUG) 
é kK dy, @ 
= Z,— E(é;) = Z; — — HE; 
‘OZ, (&;) 2 a7, dy, (&;) 


fori +7 = 1,2,...,k and 
kK ay @ 


V(é;) = 4:2 aZ, by, E(é,) (¢=1,2,...,k). (26) 
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From the set of equations (26), we can obtain the solution for Z,(@y,/@Z;), as a function g; (yj, ...,y,) 
(i,t = 1,2,...,%). Then we have 


dZ dZ ’ ; 
dys = Gin - Gt +I YZ for i= 1,2,...,&. (27) 
1 k 


From the above set of equations, we can obtain $dZ_i/Z_i$ or $(1/Z_i)(\partial Z_i/\partial y_t)$ in terms of functions of
$(y_1, y_2, \ldots, y_k)$ and hence obtain $Z_i$. After this operation, we can express $E(\xi_i)$ in terms of $Z_1, Z_2, \ldots, Z_k$
and obtain the solution from $Z_i(\partial/\partial Z_i) \log f(Z_1, \ldots, Z_k) = E(\xi_i)$. Thus we can determine the power series,
and so the distribution.

For example, let the means, variances and covariances be given as (25). Then 











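$$\operatorname{cov}(\xi_i, \xi_j) = -np_i p_j = nZ_i \frac{\partial p_j}{\partial Z_i} = nZ_j \frac{\partial p_i}{\partial Z_j} \quad\text{by the use of (26),}$$
i.e.
$$Z_i \frac{\partial p_j}{\partial Z_i} = Z_j \frac{\partial p_i}{\partial Z_j} = -p_i p_j \quad\text{for } i \ne j = 1, 2, \ldots, k.$$
Similarly
$$Z_i \frac{\partial p_i}{\partial Z_i} = p_i(1 - p_i) \quad\text{for } i = 1, 2, \ldots, k.$$
Hence
$$\frac{dp_i}{p_i} = -p_1\frac{dZ_1}{Z_1} - p_2\frac{dZ_2}{Z_2} - \ldots - p_{i-1}\frac{dZ_{i-1}}{Z_{i-1}} + (1 - p_i)\frac{dZ_i}{Z_i} - p_{i+1}\frac{dZ_{i+1}}{Z_{i+1}} - \ldots - p_k\frac{dZ_k}{Z_k}. \tag{28}$$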
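To obtain $dZ_i/Z_i$ in terms of $dp_t/p_t$ $(t = 1, 2, \ldots, k)$, we have to obtain the inverse of
$$\begin{pmatrix} 1-p_1 & -p_2 & \cdots & -p_k \\ -p_1 & 1-p_2 & \cdots & -p_k \\ \vdots & \vdots & & \vdots \\ -p_1 & -p_2 & \cdots & 1-p_k \end{pmatrix} = I - \mathbf{1}p',$$
where $p' = (p_1, p_2, \ldots, p_k)$, $\mathbf{1}' = (1, 1, \ldots, 1)$ and $I$ is a $k \times k$ identity matrix. We can easily show that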


$$(I - \mathbf{1}p')^{-1} = I + \mathbf{1}p'/p_{k+1}, \qquad p_{k+1} = 1 - \sum_{j=1}^{k} p_j.$$
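The inverse stated above is a rank-one update of the identity, and it can be confirmed exactly for any particular $p$. The following minimal sketch does so in rational arithmetic for $k = 3$; the values of $p$ and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def identity_minus_one_p(p):
    """The k x k matrix I - 1p', whose (i, j) entry is delta_ij - p_j."""
    k = len(p)
    return [[(1 if i == j else 0) - p[j] for j in range(k)] for i in range(k)]

def claimed_inverse(p):
    """The k x k matrix I + 1p'/p_{k+1}, with p_{k+1} = 1 - sum(p)."""
    k = len(p)
    p_last = 1 - sum(p)
    return [[(1 if i == j else 0) + p[j] / p_last for j in range(k)] for i in range(k)]

def matmul(a, b):
    """Plain product of two square matrices stored as lists of rows."""
    k = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(k)] for i in range(k)]

# Illustrative values only: k = 3 and p_4 = 2/5.
p = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]
k = len(p)
result = matmul(identity_minus_one_p(p), claimed_inverse(p))
is_identity = all(result[i][j] == (1 if i == j else 0) for i in range(k) for j in range(k))
print(is_identity)
```

The check works because $p'\mathbf{1} = 1 - p_{k+1}$, so the cross terms in the product cancel.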
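Thus the solution is
$$\frac{1}{Z_i}\frac{\partial Z_i}{\partial p_i} = \frac{1}{p_i} + \frac{1}{p_{k+1}} \quad\text{and}\quad \frac{1}{Z_i}\frac{\partial Z_i}{\partial p_t} = \frac{1}{p_{k+1}} \quad\text{for } i \ne t = 1, 2, \ldots, k.$$
Hence we have $\log Z_i = \log (p_i/p_{k+1})$, i.e. $Z_i = p_i/p_{k+1}$. From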


$$E(\xi_i) = nZ_i \Big/ \Big(1 + \sum_{j=1}^{k} Z_j\Big) = Z_i \frac{\partial}{\partial Z_i} \log f(Z_1, \ldots, Z_k),$$





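we obtain
$$\log f(Z_1, \ldots, Z_k) = n \log\Big(1 + \sum_{j=1}^{k} Z_j\Big) + c(Z_1, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_k), \tag{29}$$
where $c(Z_1, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_k)$ is the constant of integration. Since (29) is true for all $i$, we must have
$$\log f(Z_1, \ldots, Z_k) = n \log\Big(1 + \sum_{j=1}^{k} Z_j\Big), \quad\text{so}\quad f(Z_1, \ldots, Z_k) = \Big(1 + \sum_{j=1}^{k} Z_j\Big)^n.$$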


Hence, we shall have the probability of the random variables $\xi_1, \xi_2, \ldots, \xi_k$ to take the values $x_1, x_2, \ldots, x_k$
as given in (23).
(B) Negative multinomial distribution. 


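This is generated by $f(Z_1, Z_2, \ldots, Z_k) = (1 - Z_1 - Z_2 - \ldots - Z_k)^{-n}$, where
$$Z_i = p_i \Big/ \Big(1 + \sum_{j=1}^{k} p_j\Big), \quad 0 < Z_i < 1, \quad \sum_{j=1}^{k} Z_j < 1 \quad\text{and}\quad p_i \ge 0.$$
Then
$$P(\xi_1 = x_1, \ldots, \xi_k = x_k) = \frac{\Gamma\big(n + \sum_{j=1}^{k} x_j\big)}{\Gamma(n) \prod_{i=1}^{k} \Gamma(1 + x_i)} \prod_{i=1}^{k} p_i^{x_i} \Big(1 + \sum_{j=1}^{k} p_j\Big)^{-(n + \sum_{j=1}^{k} x_j)} \quad\text{for all } x_i = 0, 1, 2, \ldots. \tag{30}$$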


We shall have means, variances and covariances as
$$E(\xi_i) = np_i, \quad V(\xi_i) = np_i(1 + p_i) \quad\text{and}\quad \operatorname{cov}(\xi_i, \xi_j) = np_i p_j \quad\text{for } i \ne j = 1, 2, \ldots, k.$$
The converse result can also be obtained in the same manner as the above positive multinomial one. 
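The stated moments of the negative multinomial can be checked by direct summation. The sketch below assumes the probability takes the usual negative multinomial form, $\Gamma(n + \sum x_j)\,/\,\{\Gamma(n) \prod x_i!\}$ times $\prod p_i^{x_i}\,(1 + \sum p_j)^{-(n + \sum x_j)}$, and sums it over a truncated grid for $k = 2$. The truncation point, the parameter values and the function names are illustrative assumptions.

```python
import math

def neg_multinomial_pmf(xs, n, p):
    """Negative multinomial probability, computed on the log scale for stability."""
    total = sum(xs)
    log_prob = math.lgamma(n + total) - math.lgamma(n)
    for x, p_i in zip(xs, p):
        log_prob += x * math.log(p_i) - math.lgamma(x + 1)
    log_prob -= (n + total) * math.log(1.0 + sum(p))
    return math.exp(log_prob)

def truncated_moments(n, p, limit=80):
    """Sum the bivariate distribution over 0 <= x1, x2 < limit."""
    s0 = s1 = s2 = s11 = s12 = 0.0
    for x1 in range(limit):
        for x2 in range(limit):
            prob = neg_multinomial_pmf((x1, x2), n, p)
            s0 += prob
            s1 += x1 * prob
            s2 += x2 * prob
            s11 += x1 * x1 * prob
            s12 += x1 * x2 * prob
    return {"total": s0, "mean1": s1, "var1": s11 - s1 * s1, "cov12": s12 - s1 * s2}

# Illustrative values only; n need not be an integer here.
n, p = 2.5, (0.3, 0.5)
m = truncated_moments(n, p)
print(m["total"])
print(m["mean1"], n * p[0])
print(m["var1"], n * p[0] * (1 + p[0]))
print(m["cov12"], n * p[0] * p[1])
```

With these parameter values the neglected tail beyond the truncation point is far below floating-point precision, so the printed pairs should agree closely.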
REVIEWS


The Advanced Theory of Statistics. Volume I, Distribution Theory. By Maurice 
G. Kendall and Alan Stuart. London: Charles Griffin and Co. Ltd. 1958. Pp.
xii+ 433. 84s. 


The 16 years which have elapsed since the publication in 1943 of vol. 1 of M. G. Kendall’s The Advanced 
Theory of Statistics have witnessed a remarkable, indeed a rather bewildering output of text-books on 
statistics of very varied quality. It is therefore most satisfactory that Professor Kendall, with Mr Alan 
Stuart’s help, has already completed part of the process of bringing his pioneer contribution up to 
date. To maintain the same comprehensive standard, three volumes will be required in place of two; 
the first was published in the summer of 1958 and the second is promised in 1960. 

Broadly speaking, the first 13 chapters of the new vol. I correspond to the first 11 chapters of the 
original vol. 1. The increase in number results from dividing the old chapter 7 on Probability and Like- 
lihood into chapters on The Calculus of Probabilities and Probability and Statistical Inference, and 
the splitting of chapter 11 on Approximations to Sampling Distributions into two. 

The expansion of the work into three volumes has no doubt made it possible for the authors to think 
out again the order and form of presentation of the later material; as a result the present volume closes 
with three in place of the original five chapters; of these, chapter 14 on Order-statistics and 15 on 
The Multivariate Normal Distribution and Quadratic Forms are new, while the final chapter 16 on 
Distributions Associated with the Normal contains material drawn together from elsewhere in the
original vols. I and II.

The changes made in turning the old chapters into the new are of two kinds. In the first place there 
are additions of entirely new material made necessary by advances in statistical theory and improvements
in its techniques. Thus we find, for example, accounts of R. A. Fisher’s logarithmic distribution
with an example (§§5.16, 5.17); of the log-normal distribution and of N. L. Johnson’s $S_B$ and $S_U$
frequency curves (§§6.27–6.35); a reference to uses of the method of steepest descents (§11.13), and
a brief account of the method of deriving sampling moments systematically with the help of tables
of symmetric functions, such as those recently compiled by F. N. David and M. G. Kendall (§12.5).
But apart from the introduction of new material, it is evident that the authors have reconsidered the 
detailed presentation throughout and where they thought that this would clarify the argument have 
made alterations by slight re-arrangement of the text, by addition of paragraphs and by improvement 
of the illustrative examples. This process of improvement has been helped by changes in typography 
—the main section headings are now in bold instead of italic type, the tables are headed in bold type 
and printed in a more pleasing manner and the algebraic formulae are now displayed more compactly. 

The first doubling of chapters mentioned above has permitted an improved presentation of the 
brief introduction to the ideas of probability, likelihood and inference which was inserted in the original 
plan at this half-way stage to give point to the development in vol. 1 of what is primarily a mathe- 
matical theory of statistical distributions. Thus the concept of a random variable is now treated in 
greater detail (§§7.12–7.19). It is interesting to note the authors’ conclusion, summed up in the
following sentences (p. 180), that there is no unique best method to relate a mathematical theory to 
the way we think: 

‘It seems, however, that every man must choose for himself and that his psychological make-up, 
his experience and his fields of interest all determine the kind of axiomatization which he prefers. 
In statistics it is a mark of immaturity to argue overmuch about the fundamentals of probability 
theory.’ 

In the last sentence the intended emphasis is presumably on ‘overmuch’ and the authors are 
thinking of that kind of dogmatic assertion which claims that one method of approach is right and the 
others wrong. They can hardly mean that there should not be continued interest in the fundamentals 
nor fail to realise that when two or three people discuss such matters keenly there is bound to be 
argument!

The second doubling of chapters, 11 into 12 and 13, makes possible some re-arrangement and expan- 
sion in the handling of sampling moments and sampling cumulants, including the sampling from finite 
populations. As now presented, the second of the two chapters deals with bivariate k-statistics and 
the proofs of the combinatorial rules stated in the earlier chapter. 

The chapter on Order Statistics (14), besides incorporating the sections on the distribution of mth 
values and of range which previously appeared in the chapter on Standard Errors, contains a good
deal of new matter. The new chapter 15 on The Multivariate Normal Distribution and Quadratic
Forms introduces the vector and matrix notation which presumably will be taken up again in vol. III
where the application of methods of multivariate analysis is to be included.

The original vol. 1 closed with a mixed series of chapters in which, besides the development of the 
necessary theory, some straightforward numerical illustrations were given of the uses (a) of the $\chi^2$
test, (b) of bivariate correlation and regression, and (c) of other measures of association. All this illu- 
stration of application has now been removed, presumably to be included in the later volumes, and 
instead we have here brought together in a single chapter on Distributions Associated with the 
Normal (16), the mathematical properties of and approximations to the distributions of the standard 
normal-theory statistics $\chi^2$, $t$, $F$ and $z$, $r$ and $b$. There are obvious advantages in this arrangement
which makes available in one place the distribution theory of the tools required for later application. 

The volume closes with 12 single-page tables, 10 dealing with the normal distribution, $\chi^2$, $t$, $F$ and $z$
and two giving (i) augmented symmetric functions in terms of power-sums up to weight 6 and (ii) 
multiple k-statistics in terms of augmented symmetric functions up to 6th order. The last two tables 
are linked with the discussion in chapter 12. There is also a list of References to the authors quoted 
and a 14 pp. Index. It is a pity that here (as well as in another recent book on statistics) the name 
Gosset has been spelt with two t’s!

The Exercises left for solution by the reader have been greatly increased in number so that there 
are now about 20 at the end of each chapter. This should prove a valuable addition for the serious 
student. 

It is of course impossible at present for a reviewer to survey this great undertaking as a whole, 
because its successful rounding off must depend on the way in which the later volumes on Statistical 
Inference and Statistical Relationship (2) and Statistical Planning and Analysis, and Time-Series (3) 
are built up on the present volume on Distribution Theory. With a project of this magnitude there are 
inevitably alternative methods of arranging the order of the parts. In dealing with statistics and pro- 
bability the writer of an expository text is faced with the problem of ordering in a single dimension, 
subjects whose links are essentially multi-dimensional. From the teaching point of view it is instructive 
in itself to see how the present authors have attacked their problem. 

The decision on the broad plan was taken by Professor Kendall nearly 20 years ago, but the changes 
already appearing in the revised vol. 1 suggest that this plan is being modified to some extent. The 
removal of much of the numerical illustration from the later chapters, while retaining a generous use 
of such illustration in the first few chapters appears a sound policy. A more difficult matter for decision 
must have been where, how and to what extent to introduce the ideas of inference and random sam- 
pling. The placing of an account of Bayes’s Theorem and Maximum Likelihood with a discussion on 
randomness and bias in practical sampling in the middle of vol. 1 is undoubtedly in some ways anomalous.
So, too, is the inclusion in this volume of tables of percentage points of $\chi^2$, $t$, $F$ and $z$ before the
reader has been shown how to use them. But if it is remembered that ultimately all three volumes will 
be available together so that the reader can choose at will his own linking and cross-linking of the 
parts, criticism on such points is hardly valid. 

It remains only for the reviewer to remark that he awaits the appearance of vols. II and III in pleasant
anticipation and that at the same time he wishes there were a solution to the problem of how the 
advanced student is to find means of purchasing his own copies of the complete series at a total price 
which is likely to approach £12. E. S. PEARSON 


Planning of Experiments. By D. R. Cox. New York: John Wiley and Sons Inc.; 
London: Chapman and Hall Ltd. 1958. Pp. vii+ 308. 60s. 


Let us define our terms. A statistician (i.e. a real statistician) is a person experienced in handling data, 
in all the various stages from planning to presentation (decision-making is a further step which usually 
involves non-statistical considerations). A mathematical statistician has some fluency in various tech- 
niques involved in these stages, and is usually actively engaged in extending these techniques, or in 
inventing new ones. A statistical mathematician studies various idealized problems arising out of the 
experience of real statisticians—he is not immediately interested in real data. All three classifications 
may apply to one individual at different times. The statistician of most immediate value to a commercial 
employer is the real mathematical variety.

It is a common and valid criticism of many university departments and standard texts that they omit 
sufficient mention of the many difficulties sure to be encountered in applying the mathematical tech- 
niques in real life, and offer little guidance in these matters. Thus new graduates very seldom qualify 
to be called real statisticians, and it usually takes them some time to supplement their formal training
with practical experience. 

Dr Cox’s book is extremely welcome in that it concentrates on just those matters where little guidance 
is available elsewhere. The practical aspects of experimental design are here dealt with in a highly 
individual way—the discussion avoids mathematical technicalities, yet is careful and detailed; it is 
mostly in terms of concrete examples, taken from many different fields. It is intended primarily for the 
private reading and reference of the experimental workers, and requires no specialized knowledge for 
its understanding; thus for instance the details of the analyses of variance involved receive only incidental 
mention. 

Of course, there is no substitute for experience; and Dr Cox is careful to point out that his examples 
have often been simplified so that the principle under discussion might more clearly be brought out; 
references to the original accounts have been given wherever possible. Nevertheless, the book goes 
much farther than is usual in either elementary or advanced texts in bringing out the difficulties, and in 
expounding general principles. Its relation to such works as that of Cochran and Cox is not so much 
introductory as complementary. 

The first nine chapters deal with basic concepts and the key designs; the last five touch briefly on some 
more advanced topics. The concern is solely with comparative experiments so that, for example, it does 
not cover the planning of surveys. 

In summary, this is a book about real statistics. It is easy to predict that it will have considerable
success; it should be made required reading for all students of statistics.
C. L. MALLOWS 


Theory and Methods of Scaling. By Warren S. Torgerson. New York: John Wiley
and Sons Inc.; London: Chapman and Hall, Ltd. 1958. Pp. xiv+460. 76s. 


In 1950 the American Social Sciences Research Council appointed a committee to study the relative 
merits of different methods of scaling; and Dr Torgerson was chosen to review and summarize, under 
the general guidance of the committee, the material available. He begins with a theoretical examination 
of the different types of scientific measurement, and then proceeds to classify the various techniques for 
metrical scaling which have been elaborated during recent years. After discussing the alternative criteria 
that could be used for such a classification, he develops a working scheme based on ‘the different ways 
in which the variations of the subjects’ responses may be allocated’. This leads him to distinguish three 
main groups. 

The first group of procedures consists of those which attribute the variations in the subjects’ responses 
to individual differences in the subjects themselves: this he terms the ‘subject-centred approach’. 
It is the principle underlying most of the work on mental testing. But, largely because it has been so 
fully discussed elsewhere, Dr Torgerson has decided to exclude it from his own review. The second group 
adopts what he calls the ‘stimulus-centred approach’. Here the systematic variations in the subjects’ 
reactions are attributed to the way the stimuli vary in respect of some specific attribute. The majority 
of the well-known psychophysical methods fall into this group; and these with their latest modifications 
are discussed in considerable detail. Separate chapters are devoted to methods based on the ‘law of 
comparative judgement’, the ‘law of categorical judgement’, and to the theory of multidimensional 
scaling—a topic to which Dr Torgerson has himself made important contributions. 

The third group adopts the ‘response method’. Here the variability in the subjects’ reactions is 
ascribed partly to variations in the subjects and partly to variations in the stimuli. This, as Dr Torgerson 
points out, is the most general method of all; but, owing to the complexity of the problems involved, 
the methods devised are relatively few and comparatively recent. He recognizes two main subclasses— 
‘deterministic models’, of which Guttman’s is the best known example, and ‘probabilistic models’, 
which include Lazarsfeld’s method of ‘latent structure’ and Coombs’s ‘unfolding technique’. 

All the more important procedures falling under each of these several heads are lucidly explained with 
detailed illustrations; the underlying theories on which they rest are critically examined; and their use 
and limitations impartially discussed. Much of the material included has hitherto been available only 
in isolated articles or specialist text-books. Hence the volume will be invaluable to every investigator
who seeks to apply quantitative techniques to social or psychological problems.
Linear Programming: Methods and Applications. By Saul I. Gass. New York
and London: McGraw-Hill Book Co. Inc. 1958. Pp. 223. 50s. 6d. 


This book differs from most other books on Linear Programming in that it contains an extensive formal 
exposition of the underlying theory. It answers in this respect a real need (even if it is not the first to 
attempt it, as the publishers’ blurb claims). The text had been prepared for an academic course, and is 
thus addressed to an audience—and readership—with well-defined and predictable mathematical 
knowledge. For British readers the level of sophistication is perhaps best defined by mentioning that 
matrix notation is used throughout, but that about 20 pages are devoted to the formal theory of matrices, 
convex sets and simultaneous linear equations. 

After this, there follows an equally readable part 1—Methods: Theoretical and Computational. 
This contains seven chapters, all of subjects related to the Simplex Method; it includes a chapter on 
Parametric Linear Programming (due partly to the author himself), which deals with a minor but 
important aspect of programming under uncertainty. There is, also, a survey of available automatic 
programmes for digital computers, necessarily and admittedly already incomplete. 

However, this is not the only feature which makes the book appear somewhat dated. Stochastic, 
non-linear, and discrete Linear Programming are mentioned only casually, clearly because some newer 
methods were unknown when the book was conceived. But even regarding the foundations there exist 
now—at any rate in the reviewer’s opinion—methods of higher didactic value than those presented by 
the author. He has followed here closely the original publications, whose importance is, in many cases,
merely historical. Even so, the purpose of making the matter clearly understood is achieved, and this is 
praise which some other books on the same subject have failed to deserve. 

When we come to part III (Applications), we find the author less successful. He gives here a rather
scrappy list of descriptions of Linear Programming in practice, but it is impossible for a reader who does 
not know the examples already to guess how some of them could be put into Linear Programming form. 
(This applies, with particular force, to the Paper Trim Problem, on p. 182, and to the Travelling Sales- 
man Problem, on p. 186.) It is true that the book contains a bibliography of applications, grouped by 
subjects such as agricultural, industrial, military, etc., but too many of the titles refer to reports of 
U.S. Government agencies, even where their text has since appeared in readily available periodicals, 
e.g. Management Science or Operations Research. 

The last chapter contains a clearly and well-written description of two-person zero-sum games and 
establishes their connexion with Linear Programming. 

The proof reading has been done very carefully: only very few misprints survived (e.g. coeds for codes
on p. 130; also, the enumeration on p. 6 should be one of alternatives, not of simultaneous features). 

In spite of the objections raised above, the book is a very useful contribution to the explanatory 
literature, and an essential item on the shelf of any Operational Research worker who is interested in the
foundations as well as the applications of his subject.


Wahrscheinlichkeitsrechnung und Mathematische Statistik (Probability and 
Mathematical Statistics). Second edition. By M. Fisz. Berlin: VEB Deutscher 
Verlag der Wissenschaften (Hochschulbücher für Mathematik Band 40). 1958.
Pp. 528. DM. 14. 


The first edition of this book (Rachunek prawdopodobieństwa i statystyka matematyczna) was reviewed
in Mathematical Reviews, 1955, 16, 492 by Z. W. Birnbaum. This second edition, much more extensive 
and complete, appeared in 1958, simultaneously in Polish and German. 

The book consists of two parts: the first is concerned with the theory of probability, the second with 
statistical theory, in particular the theory relating to inference from sample values to the universe from 
which the sample has been drawn. In the first five chapters the author discusses basic ideas: random 
events, the axiomatic theory of probability, random variates and their distributions and characteristic 
functions in one- and multi-dimensional spaces. The sixth chapter is devoted to the more advanced 
applied fields, and clear intuitive interpretations of many of the mathematical formulae help to make this 
book easier to read. It should be noted that the author uses the expression ‘the most efficient estimator’ 
for the estimator with minimum variance instead of the more usual definition ‘the efficient estimator’. 
The book contains accounts of recent work in the theory of probability and statistics and should be 
included among the modern text-books on these subjects. 

It is well printed and the type is easily read. It is not expensive for its size and should be useful both
as a general reference book and as a student’s text-book. REGINA C. ELANDT
On the Dynamics of Exploited Fish Populations. By R. J. H. Beverton and S. J.
Holt. (Fishery Investigations, series II, volume XIX.) London: H.M.S.O. 1957.
Pp. 533. £6. 6s. 


It is a pleasure to bring this book to the notice of statisticians and biometricians. The authors, one from 
the Fisheries Laboratory at Lowestoft and one from the Fisheries Division of the Food and Agriculture 
Organization at Rome, set out to describe in mathematical language the processes of fishing and seek 
from this description to predict what the effect of changing methods might be. Part I is concerned with
‘Fundamentals of the theory of fishing, illustrated by the analysis of a trawl fishery’. A preliminary model
assuming a steady state in a fishery is described and the model is then distorted to allow for other characteristics.
In part II extensions of the theory of fishing are discussed. We are introduced to the effect on
the fish population of egg production, natural mortality, gear and fishing intensity, and growth and 
feeding. 

Now it is clear that the setting up of mathematical models will involve the estimation of the parameters
involved in these models. This estimation is thoroughly discussed in part III, while part IV gives
the effect on the model of possible variations in the parameters and attempts prediction. Part IV in
fact tries to predict what would be the outcome of attempting to regulate fishing effort and net-mesh 
size. The authors come to the conclusion that the ‘best’ results—‘ best’ meaning increased catches, 
increase in profits and saving in the overheads of men and labour—will be obtained if the fishing effort 
is restricted to half that of the years before 1939, or less, and if the trawl nets have an enforced 80-90 mm.
mesh size. 

The statistician will be quite clear that to set up mathematical models is not enough: it is also necessary 
to verify that the model fits the observed data. The authors are also aware of this necessity. They do not 
in the course of their book usually give the data on which their models are based. What they do is to 
refer to the published research work of themselves and others as justification. In the light of the length 
of the book in even its present state this seems an admirable procedure. 

The exposition is lucid and enjoyable. It can be recommended to anyone tired of abstract theory and
wanting real applications. F. N. DAVID


Contribuito allo Studio delle Tavole di Nuzialita. By G. Panizzon. Padova, Italy: 
Cedam. 1958. Pp. 143. 1500 lire. 


This book is concerned with the construction of tables of the probability of marriage in Italy. The 
probability of marrying within a year, for a bachelor or spinster of any given age, is termed the ‘relative’ 
probability. Since an appreciable proportion of unmarried persons will die, emigrate or immigrate within 
the year, a correction has to be made for these factors before a valid comparison can be made between 
marriage rates at different epochs. The author therefore calculates a hypothetical ‘absolute probability’ 
of marriage which would result if migration or death could have been prevented. The ‘life tables’ based 
on these absolute probabilities are called ‘gross’ tables. The author shows how to obtain relative and 
absolute probabilities and net and gross tables from available demographic data, and gives new tables 
of his own based on the 1951/52 census, comparing these with earlier Italian work, and also with British 
marriage rates. The general conclusion is that on the average marriage is now postponed, at least for 
bachelors, to an appreciably later date than in 1900. The modal probability for males has shifted by 
about 4 years in the last 50 years: in females the difference is rather smaller, and the curve has become 
appreciably less skew. The book is illustrated by numerous tables and a few graphs. There is also a
discussion of abbreviated tables (at, say, 5-year intervals) and rapid methods. CEDRIC A. B. SMITH


Periodic Regression in Biology and Climatology. By C. I. Bliss. New Haven,
Connecticut: Connecticut Agricultural Experimental Station. (Bulletin 615.) 1958. 
Pp. 55. Gratis. 


The author discusses the analysis of periodic data when the length of the cycle is determined indepen- 
dently, where the observations are spaced at even intervals and there is a constant number of replications 
at each point. Illustrations are given on monthly temperatures, iodine values for butterfat, hourly 
potential in an elm tree, monthly death rates from pneumonia, monthly incidence of poliomyelitis, 
hourly incidence of human births, reaction of toads to gonadotrophin and hourly heat exchange of
cows. It will be noted that the bulletin is issued free on application. F. N. DAVID
Queues, Inventories and Maintenance. (Publications in Operations Research, No. 1.)
By Philip M. Morse. New York: John Wiley and Sons Inc.; London: Chapman
and Hall Ltd. 1958. Pp. 202. 52s. 


The authors of Finite Queueing Tables open their preface with the statement that ‘there are several 
text-books and monographs which discuss the theoretical aspects of queuing theory’: nevertheless, this
reviewer welcomes Professor Morse’s book as the first in English devoted entirely to this subject. His 
aim is modest: ‘The present volume does not pretend to be an exhaustive treatise on queuing theory.
Its purpose is primarily expository, to present enough of the concepts, to define some of the terms, and 
to illustrate a few of the analytic techniques...’, and within this framework he is very successful. Apart 
from a short chapter on transient behaviour (which he dismisses too lightly: ‘Transient solutions are 
not usually of much practical importance’) he confines his attention to the steady-state solutions of 
systems whose inter-arrival and service-time distributions are related to the negative exponential. If 
he has one major fault it is the making of too general statements. Sometimes he is cautious (p. 140): 
‘We note that the probabilities are the same no matter what the distribution of replenishment times is 
(constant, or exponential, or hyper-exponential)’; sometimes he is right (p. 98): ‘Eqs. (7.37) and (7.59)
show that when the single service channel is exponential, state probabilities $P_n$ form a geometric series
no matter what the arrival distribution may chance to be.’ But at least once he is wrong (p. 87): ‘As 
with any single-channel, infinite-queue system, the probability that the system is empty is 1—p.’ In 
the single-channel system denoted (in D. G. Kendall’s now standard notation) by F,/E,/1 one can obtain


Po = (1—p) {1-41 +p—V{(1+p)?+4p))}, 
which is not generally equal to 1—p. 

Technically the book is well produced and the number of misprints is small. There is one point of
presentation which could cause confusion to the unwary. On p. 47 the author numbers the phases of 
an E, arrival mechanism in the reverse order: on p. 84 he numbers them in the natural order: on p. 108, 
fig. (8-5) does not correspond to equation (8-6). The notation used for an E, service mechanism is 
consistent. 

At the end of a discussion of a maintenance model Professor Morse has written: ‘All this could have
been reasoned out qualitatively without recourse to the equations. By use of the equations, however, we 
can ensure quantitatively that we use the repair facilities in an optimum manner.’ This attitude per- 
vades the book, and although some of his arguments are a little untidy he has written a book which will 
be useful particularly to the practical man who wants to describe real situations, and useful also to the 
mathematician coming fresh to queuing theory who wants to know what it is about without being
overwhelmed by rigour.
D. M. G. WISHART 


Finite Queueing Tables. (Publications in Operations Research, No. 2.) By L. G. Peck 
and R. N. Hazelwood. New York: John Wiley and Sons Inc.; London: Chapman
and Hall Ltd. 1958. Pp. 120. 68s. 


On internal evidence the book by Professor Morse (reviewed above) appears to have been written before 
the book under review. It is a pity therefore that Doctors Peck and Hazelwood had not consulted the 
manuscript of Professor Morse’s book before preparing their own: it would greatly have helped the
reader (and the two books being of a series they command essentially the same audience) had a uniform 
notation been adopted. The tables are concerned primarily with problems of maintenance where the 
population (N) of potential customers is finite. Two quantities, D and F, are tabulated as functions of N, 
M (the number of repair men), and a parameter X (which is related to the parameter x used by Morse 
in his discussion of the same problem by the formula X = 1/(1+x)). F is called the efficiency factor (the
only inefficiency in the authors’ opinion appears to be a machine broken down and waiting for service): 
D is said to be the probability of a call for service being delayed (it is really one minus the probability 
that a call received immediate attention—in the case of an infinite population these are the same, but 
for a finite population we encounter the difficulty that if the entire population is broken down then 
there are no more calls to be delayed). As D. R. Cox has pointed out to at least one reviewer, the tables 
appear to be in error. Using formulae 11-12 and the tables in Professor Morse’s book this reviewer has 
derived D for M = 1, X = 0.5, and N = 3, 4, 5; these values agree with D in Peck and Hazelwood’s
tables for M = 1, X = 0.5, and N = 4, 5, 6 respectively. Conjecture: for N read N - 1 throughout.

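The two readings of D distinguished in this review can be compared numerically. What follows is only a sketch under assumptions the review does not state: running and repair times are taken to be exponential, x is read as the ratio of mean running time to mean repair time (so that X = 1/(1 + x)), and all function names are hypothetical. It computes the fraction of time during which every repair man is busy, and separately the fraction of calls that find every repair man busy.

```python
from math import comb, factorial


def state_probs(N, M, X):
    """Equilibrium probabilities that n = 0..N machines are broken down."""
    rho = X / (1.0 - X)  # assumed: mean repair time / mean running time, i.e. 1/x
    weights = []
    for n in range(N + 1):
        w = comb(N, n) * rho ** n
        if n > M:
            # only M of the n broken machines can be under repair at once
            w *= factorial(n) / (factorial(M) * M ** (n - M))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def busy_fraction_of_time(N, M, X):
    """One minus the long-run probability that a repair man is free."""
    return sum(state_probs(N, M, X)[M:])


def delayed_fraction_of_calls(N, M, X):
    """Fraction of calls arriving when all M repair men are busy.

    Calls come only from running machines, so state n is weighted by N - n.
    """
    p = state_probs(N, M, X)
    calls = [(N - n) * p[n] for n in range(N + 1)]
    return sum(calls[M:]) / sum(calls)


for N in (4, 5, 6):
    print(N, delayed_fraction_of_calls(N, 1, 0.5), busy_fraction_of_time(N - 1, 1, 0.5))
```

In this model the two printed columns agree, so a call-weighted D for a population of N coincides with a time-weighted D for a population of N - 1. Whether that accounts for the shift conjectured above depends on which definition each book actually tabulates, which the review does not settle.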

D. M. G. WISHART 
Handbook of Probability and Statistics with Tables. By R. S. Burington and
D. C. May. U.S.A. and London: McGraw-Hill Book Co. Inc. 1953. Pp. ix + 246 + 72.
46s. 6d. 


This book containing 246 pages of expository material and 72 pages of tables is intended as a ‘cook-book’ 
for users of statistics rather than an instruction text for the embryo statistician. The idea has been to 
summarize formulae, definitions, theorems, etc., commonly met with in elementary statistics and 
probability theory, and to give short tables of the particular functions thus summarized. The topics 
covered are the binomial, Poisson and normal distributions, regression and time series, tests of signi- 
ficance and confidence intervals, elementary analysis of variance, finite differences and interpolation, 
and quality control. The tables include the binomial ${}_nC_x\, p^x q^{n-x}$ for n = 1(1)...20, x = 0(1)...n,
p = 0.05(0.05)...0.50, the summed binomial for the same set of values, the Poisson distribution
$e^{-m} m^x / x!$ for m = 0.1(0.1)...10, 11(1)...20, and the summed Poisson, the normal curve area and ordinates
and the first five derivatives, tables of F, z, t, and $\chi^2$. There are also supporting auxiliary tables.


The exposition is adequate without being inspired.


Elementary Matrix Algebra. By Franz E. Hohn. New York: The Macmillan Co.
1958. Pp. xi+305. 52s. 6d. 


The author has had considerable experience in presenting the elements of matrix algebra to varied 
groups of students, including statisticians; the present volume appeared in preliminary editions in 
1952 and 1957. The material has been very carefully arranged and graded. The chapter headings are 
Introduction to Matrix Algebra, Determinants, The Inverse of a Matrix, Rank and Equivalence, Linear 
Equations and Linear Dependence, Vector Spaces and Linear Transformations, Unitary and Orthogonal 
Transformations, The Characteristic Equation of a Matrix, and Bilinear, Quadratic, and Hermitian 
Forms. This final chapter includes a proof of Cochran’s theorem. There are appendices on the Σ and Π
notations, complex numbers, and isomorphism. 

The emphasis throughout is slightly towards abstract algebra—fields, groups, and vector spaces are 
defined in the course of the work—but these concepts are always carefully introduced and explained in 
concrete terms. There are many exercises for the student, some of which are marked as comprising 
results of particular importance. 

The author disclaims any pretensions to completeness, and detailed treatment of applications has 
not been attempted; from the statistical point of view it is unfortunate that he has not found space to 
mention the multivariate normal distribution, multiple regression, or to give the Helmert and hyperspherical
polar transformations explicitly. Nevertheless, this is an excellent book which deserves to
become standard.
C. L. MALLOWS


Introduction to Functional Analysis. By A. E. Taylor. New York: John Wiley and
Sons Inc.; London: Chapman and Hall Ltd. 1958. Pp. 423. 100s. 


Professor Taylor avowedly intends that this book shall help the beginner in Functional Analysis who 
starts with a modest background consisting mainly of a sound Honours Degree course in ‘classical 
Analysis’, the additional necessary preparation regarding Linear Spaces and General Topology being 
provided within the book. He has succeeded in attaining this object, the book being admirably suited 
to use as a text accompanying early postgraduate training. The subject-matter is clearly presented and 
well illustrated by examples, the choice of which is not limited to the orthodox (and by now somewhat 
dated) applications to sequence spaces and Summability Theory. Professor Taylor’s personal interest 
in applications to complex variable problems, for example, has provided a refreshing source of illu- 
strative examples. Other comparatively little known illustrations are provided by discussions of the 
functional analytic aspects of the Dirichlet Problem and Weyl’s ‘ Projection Method’ in Potential Theory. 
Problems appear in fair abundance and cover a wide range of difficulty. The more difficult questions,
often accompanied by helpful suggestions, must frequently succeed in stimulating the reader to continue 
the trend of thought introduced in the text. Most chapters begin with a summary which includes an 
allocation of rank to, and an indication of the relations between, the major theorems to follow. The
allocation may not always be that which would be awarded by the specialist, but it is in accord with the 
aim of this book. This same remark applies to the occasional repetition of fundamental results resulting 
from the natural emphasis on normed linear spaces. 

Regarding the specific contents of the book, it is fair to say that, apart from the theory of Banach 
algebras, the major portion of the basic equipment of a functional analyst is provided. Chapters 1 and 2 
set the stage by dealing with Linear Spaces and General Topology respectively, whilst chapter 3 (Topological
Linear Spaces) begins the book proper. Some of the topics discussed in chapter 3 receive further
attention in chapter 4; in particular, Duality Theory appears in both chapters. Chapter 4 also deals 
with Representation Theorems for linear functionals and operators, the deeper properties of linear 
operators (including that happy trio: the Homomorphism, Inversion and Closed Graph Theorems), 
and boundedness principles for sets of linear operators. Applications of the Riesz Convexity Theorem are 
also discussed. 

Chapters 5 and 6 are both devoted to Spectral Theory, the latter chapter concentrating on the Hilbert 
space setting. This highly developed field receives a treatment which is both clear and reasonably complete.
It includes the Riesz theory for compact operators and the Spectral Decomposition Theorem for
bounded self-adjoint and unitary operators; the case of unbounded self-adjoint operators is only briefly 
treated, details of proofs being omitted and references given. 

Chapter 7, entitled ‘Integration and Linear Functionals’, is perhaps the least satisfactory portion of 
the book. It provides a guide to existing accounts, without itself providing more than a few of the details. 
This reviewer suspects that satisfaction will be gained only by the reader who already has some knowledge 
of the topics under discussion, and for whom the chapter would be largely superfluous anyway. Other- 
wise full reward will be reserved (perhaps not unjustly) for the keen reader who is prepared to follow up 
the reading suggested. Abstract measure spaces, Borel measures and Radon measures are all dealt 
with, albeit rather summarily, as also are bounded, finitely-additive set functions and their integrals.

The typography is excellent, and no misprints were found. 

One may confidently expect that this book will find a well deserved place on the personal bookshelves 
of many students and teachers of Functional Analysis, particularly so in the former case if it should be
possible to publish a cheaper edition.
R. E. EDWARDS


Tables of Partitions. (Royal Society Mathematical Tables, Vol. 4.) By H. Gupta, C. E.
Gwyther and J. C. P. Miller. Cambridge University Press for the Royal Society.
1958. Pp. xxxix+132. 63s.


In spite of their title these tables do not list the partitions of numbers but consist essentially of an 87-page 
table (Table 1) of the function p(n, m), the number of partitions of n into at most m parts, for the ranges 
n = 1(1) 200, m = 0(1) min(n, 100) and n = 201 (1) 400, m = 0(1) 50. The integer-valued entry goes up 
to nineteen figures at its largest. The excellent and comprehensive introduction reveals how many other 
allied functions may be easily derived from this, as for instance the number of partitions of n with: 
greatest part at most m; exactly m parts; exactly m unequal parts; greatest part exactly m; exactly m 
unequal parts (including possibly a zero part); etc. The introduction also gives the exact formulae for 
p(n, m) for m up to 12 (of Herschel, Cayley, Sylvester & Glaisher) in terms of both circulators and prime 
circulators. Further, a short discussion of asymptotic formulae is given followed by a description of 
the method of computation of the tables and a bibliography. 
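The way the allied functions follow from p(n, m) can be illustrated with a short computation. This is a sketch only: the recurrence and the shifts used are the standard combinatorial ones, not taken from the introduction under review, and the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p(n, m):
    """Number of partitions of n into at most m parts."""
    if n == 0:
        return 1
    if n < 0 or m == 0:
        return 0
    # either fewer than m parts, or exactly m parts (take 1 off each part)
    return p(n, m - 1) + p(n - m, m)


def greatest_part_at_most(n, m):
    # conjugating the partition diagram swaps "parts" and "greatest part"
    return p(n, m)


def exactly_m_parts(n, m):
    return p(n - m, m)


def greatest_part_exactly(n, m):
    return p(n - m, m)


def exactly_m_unequal_parts(n, m):
    # subtract 1, 2, ..., m from the parts taken in increasing order
    return p(n - m * (m + 1) // 2, m)


def exactly_m_unequal_parts_zero_allowed(n, m):
    # subtract 0, 1, ..., m - 1 instead, so the smallest part may be zero
    return p(n - m * (m - 1) // 2, m)


print(p(10, 3), exactly_m_parts(10, 3), exactly_m_unequal_parts(10, 3))
```

Each derived count is a single look-up in the same table at a shifted argument, which is presumably why one table of p(n, m) can serve for all of them.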

The remainder of the tables consist of values of $p_1(n, m)$ [Table II, pp. 90-121], $p_2(n, m)$ [Table III,
pp. 122-131] and $p_1(n, 0)$ [Table IV, p. 132], tabulated over a complicated range of values not briefly
to be expressed. These functions may be used successively to check and extend the earlier tables in
virtue of such relations as

$$p(n, m) = \sum_{r \ge 0} (-1)^r\, p_1\bigl(n - rm - \tfrac{1}{2} r(r+1),\, r\bigr),$$

$$p_1(n, m) = \sum_{r \ge 0} (-1)^r\, p_2\bigl(n - rm - \tfrac{1}{2} r(r+1),\, r\bigr), \quad \text{etc.}$$

These functions are correspondingly larger integers; the function $p_1(n, 0)$, the number of unrestricted
partitions of n, being of thirty-two figures for n = 1000, for instance. They are generally defined by

$$\sum_{n=0}^{\infty} p_s(n, m)\, t^n = \{A(t, t)\}^s \sum_{n=0}^{\infty} p(n, m)\, t^n,$$

where $A(t, t) = \prod_{r=1}^{\infty} (1 - t^r)^{-1}$ and $\sum_{n=0}^{\infty} p(n, m)\, t^n = \prod_{r=1}^{m} (1 - t^r)^{-1}$.

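The first checking relation can be tried numerically on a small range. The sketch below rests on an assumed reading of the definitions: the first auxiliary function is taken to be the convolution of p(n, m), in n, with the unrestricted partition numbers, and the relation tested is that p(n, m) equals the alternating sum over r of that function at (n - rm - r(r+1)/2, r). Function names are hypothetical and nothing here reproduces the authors' method of computation.

```python
def partition_table(n_max, m_max):
    """T[m][n] = number of partitions of n into at most m parts."""
    T = [[0] * (n_max + 1) for _ in range(m_max + 1)]
    for m in range(m_max + 1):
        T[m][0] = 1
    for m in range(1, m_max + 1):
        for n in range(1, n_max + 1):
            T[m][n] = T[m - 1][n] + (T[m][n - m] if n >= m else 0)
    return T


def unrestricted(n_max):
    """u[n] = number of unrestricted partitions of n."""
    u = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for n in range(part, n_max + 1):
            u[n] += u[n - part]
    return u


def p1_table(T, u):
    """Convolve each row of T with u (the assumed first auxiliary function)."""
    n_max = len(u) - 1
    return [[sum(u[k] * row[n - k] for k in range(n + 1)) for n in range(n_max + 1)]
            for row in T]


def alternating_sum(P1, n, m):
    """Sum over r >= 0 of (-1)^r * P1[r][n - r*m - r(r+1)/2]."""
    total, r = 0, 0
    while n - r * m - r * (r + 1) // 2 >= 0:
        total += (-1) ** r * P1[r][n - r * m - r * (r + 1) // 2]
        r += 1
    return total


T = partition_table(30, 30)
P1 = p1_table(T, unrestricted(30))
print(all(alternating_sum(P1, n, m) == T[m][n] for n in range(31) for m in range(31)))
print(len(str(unrestricted(1000)[1000])))  # number of figures in the count for n = 1000
```

The second print gives a direct check on the size quoted for n = 1000.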

The tables are not likely to be of great use to statisticians, who will largely be well satisfied with much 
shorter earlier tables, but it is very heartening for probabilists to have such extensive tables for reference
and use in the rare combinatorial problems which demand them.
D. E. BARTON 


Table for the Solution of Cubic Equations. By H. E. Salzer et al. U.S.A. and
London: McGraw-Hill Book Co. Inc. 1958. Pp. xv +160. 58s. 


The table gives the roots of the general cubic equation
$$ax^3 + bx^2 + cx + d = 0$$
by means of tabulating the roots of
$$\theta f^3 + f = 1$$
as functions $f_1, f_2, f_3$ of $\theta$. It is left to the reader to compute the argument
$$\theta = 3\left(3a^2 d - abc + \tfrac{2}{9} b^3\right)^2 / (3ac - b^2)^3$$
and convert the f-root back to the x-root by means of
$$x = -\tfrac{1}{3}\, b/a - f\left(3ad - bc + \tfrac{2}{9}\, b^3/a\right) / (3ac - b^2).$$


For $|\theta| > 1$ the argument is $1/\theta$; otherwise it is $\theta$, and the interval is 0.001. The whole range of $\theta$ is thus
covered. The entry is to 7 and 8 significant figures, and first and second differences are also given. When
two of the roots are complex, as they are outside (-4/27, 0), they are tabulated by their real and imaginary
parts (being conjugate). This is in spite of the fact that minus twice the real part equals the real root and
the imaginary part is only slightly less easily derived.

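The reduction the reader is asked to perform can be sketched in a few lines. The code below is an illustration under stated assumptions, not the procedure of the book: it uses the substitution x = -b/(3a) - (q/p) f that leads to an equation of the form theta*f^3 + f = 1, it handles only theta > 0 (where that equation has a single real root, lying between 0 and 1), and it replaces the table look-up and interpolation by bisection. Function names are hypothetical.

```python
def reduce_cubic(a, b, c, d):
    """Return (theta, shift, scale) such that x = shift - scale * f."""
    p = 3 * a * c - b * b
    q = 3 * a * d - b * c + 2 * b ** 3 / (9 * a)
    theta = 3 * (a * q) ** 2 / p ** 3
    return theta, -b / (3 * a), q / p


def real_root_f(theta, iterations=100):
    """Bisection for the root of theta*f^3 + f = 1 in (0, 1); needs theta > 0."""
    if theta <= 0:
        raise ValueError("this sketch only covers theta > 0")
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if theta * mid ** 3 + mid < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_cubic_real(a, b, c, d):
    """Real root of a*x^3 + b*x^2 + c*x + d = 0 when theta > 0."""
    theta, shift, scale = reduce_cubic(a, b, c, d)
    return shift - scale * real_root_f(theta)


print(solve_cubic_real(1, 0, 1, -2))   # x^3 + x - 2 = 0
print(solve_cubic_real(1, -1, 1, -6))  # x^3 - x^2 + x - 6 = 0
```

Both example cubics have 3ac - b^2 > 0, so theta is positive and the single-root case applies; the remaining ranges of theta would need the other two tabulated roots.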
The introductory matter tells us that ‘linear or quadratic interpolation will yield full accuracy nearly 
everywhere’. It is suggested that the tables will be commonly useful in what amounts to inverse 
third-order interpolation, but the reviewer can conceive of no use for them at all which would justify
paying the prohibitive price.


Surveys in Applied Mathematics. New York: John Wiley and Sons Inc.; London: 
Chapman and Hall Ltd. 1958. 

Vol. 1. Elasticity and Plasticity. By J. N. Goodier and P. G. Hodge, Jr. Pp. ix+152.
50s.

Vol. 2. Dynamics and Nonlinear Mechanics. By E. Leimanis and N. Minorsky.
Pp. xii+206. 62s.

Vol. 3. Mathematical Aspects of Subsonic and Transonic Gas Dynamics. By
Lipman Bers. Pp. xv+164. 50s.

Vol. 4. Some Aspects of Analysis and Probability. By Irving Kaplansky, Marshall
Hall Jr., Edwin Hewitt and Robert Fortet. Pp. xi+243. 72s.

Vol. 5. Numerical Analysis and Partial Differential Equations. By George E.
Forsythe and Paul C. Rosenbloom. Pp. x+204. 60s.

These survey articles have been written as a joint project of the Office of Naval Research and Applied
Mechanics Reviews. They are aimed ‘...not so much at research specialists, actively contributing to the
subjects discussed, as...at a broader, mathematically literate audience, looking for contemporary
information on the important problems and results in these disciplines’. A major objective is to present
recent developments appearing in Russian journals.

The casual reader should be warned that the degree of ‘mathematical literacy’ assumed is considerable;
it is doubtful whether any one person will have the energy, ability or inclination to digest the
contents of all these volumes. Certainly your reviewer is not competent to comment on the first three.

The obvious temptation confronting the author of such a survey is to fail to give a balanced picture
by placing undue emphasis on his own work; this temptation has not always been avoided. Also the 
articles will necessarily be out of date almost before they are published; in these volumes there are very 
few references later than 1956. 

The articles most interesting to statisticians per se are ‘A Survey of Combinatorial Analysis’ (Hall) 
and ‘ Recent Advances in Probability Theory’ (Fortet) in vol. 4, and ‘Contemporary State of Numerical 
Analysis’ (Forsythe) in vol. 5. The other articles in vol. 4, namely ‘Functional Analysis’ (Kaplansky) 
and ‘A Survey of Abstract Harmonic Analysis’ (Hewitt) may interest specialists in matters measure- 
theoretic. 

Hall’s article on combinatorial analysis (68 pp.) gives only three glancing references to Russian work.
The emphasis throughout is entirely different from that of Riordan’s recent book (reviewed in the last
issue of Biometrika); neither Hall nor Riordan gives any references to the other’s papers. Hall has very
little sympathy with methods using generating functions; his first section (Methods of Enumeration) 
suffers greatly from comparison with Riordan’s book. His second part is devoted to the theorem on 
distinct representatives, with its generalizations and some applications; the discussion in the final part 
(Existence and Construction of Designs) concentrates on the more elegant (and less practical?) parts 
of the theory. 

Fortet’s article (70 pp.) is admittedly selective, omitting much of the theory of stochastic processes. 
It includes summaries of recent work on axiomatic foundations, central limit theorems (including results 
on rapidity of convergence and limit theorems for densities), ‘general random elements’, functionals 
of random functions, and the Kolmogorov-Smirnov tests. This material is excellent, but will be rather
too abstract for the taste of most British statisticians. 

Forsythe’s article on Numerical Analysis (40 pp.) gives many references to Russian work, and dis- 
cusses Russian computers (as in 1956). Topics covered in detail are numerical integration, approximation 
of functions, solution of linear equations, matrix eigenvalue problems, and difference methods for 
Laplace’s equation. 

It is difficult to imagine who is going to buy these books. Some large mathematics departments seeking 
to interest their students may find it worth while; but many potential customers who might have been 
attracted by one or two articles will be put off by the necessity of buying also uninteresting material 
which doubles or quadruples the cost.
C. L. MALLOWS
OTHER BOOKS RECEIVED
Statistics of Extremes. By E. J. Gumbel. New York: Columbia University Press. 1958. Pp. 375.
$15.00.

Statistics: an Introduction. By Donald A. S. Fraser. New York: John Wiley and Sons Inc.;
London: Chapman and Hall Ltd. 1958. Pp. 398. 54s.

Information Theory and Statistics. By Solomon Kullback. New York: John Wiley and Sons
Inc.; London: Chapman and Hall Ltd. 1959. Pp. 394. 100s.

Statistical Estimates and Transformed Beta-Variables. By Gunnar Blom. New York: John
Wiley and Sons Inc.; London: Chapman and Hall Ltd. 1958. Pp. 176. 40s.

Die Axiomatischen Grundlagen einer Allgemeinen Theorie des Messens. By J. Pfanzagl.
Würzburg, Germany: Physica-Verlag. 1959. Pp. 63. DM.14.

General Systems. [Yearbook of the Society for General Systems Research, Vol. 3, 1958.] Michigan,
U.S.A.: Braun-Brumfield Inc. Pp. 259. $7.50.

Strategy and Market Structure. By Martin Shubik. New York: John Wiley and Sons Inc.;
London: Chapman and Hall Ltd. 1959. Pp. 387. 64s.

Statistical Methods in Biology. By N. T. J. Bailey. London: English Universities Press. 1959.
Pp. 200. 25s.

Analysis of Straight-Line Data. By F. S. Acton. New York: John Wiley and Sons Inc.; London:
Chapman and Hall Ltd. 1959. Pp. 267. 72s.

Teoria delle code. By G. Avondo Bodino and F. Brambilla. Milano: Istituto Editoriale Cisalpino.
1959. Pp. 213.

Elementary Decision Theory. By Herman Chernoff and Lincoln E. Moses. New York:
John Wiley and Sons Inc.; London: Chapman and Hall Ltd. 1959. Pp. xv+364. 60s.
CORRIGENDA
(1) Biometrika (1948), 35, pp. 157-75 


‘The power function of the test for the difference between two proportions in a 2 x 2 table.’ 
By P. B. Patnaik.
I am indebted to Mr R. L. Turner for pointing out some errors in formulae on pp. 172-3 
of this paper: 


(a) p. 172, the expressions for h, and h, should read 
h, =k ic 2(1 — py — pa) V [8 Prfi~ P2d2)|— 3( P1941 + P22)| 
(Pi + Ps) (2— Pi — Po) 
hy = k In Pi 2(1—p,— py») y| 8m Pads - P242)|— 3( p19, + Pede)\ 
q | (p1 + Ps) (2-2, — P2) 
(b) p. 173, in equation (25) for $(1 + \theta 1 - p_1 - p_2)$ read $(1 + \theta(1 - p_1) - p_2)$, an expression
which may also be written as $\theta q_1 + q_2$.


It seems probable that these errors crept in at some stage in the copying of my MS. 
as the numerical values for the second approximation to the power function given in the 
examples in Table 3 have been found to be correct. 

P. B. Patnaik 


(2) Biometrika (1958), 45, pp. 411-20 
‘Moment estimators and maximum likelihood.’ By L. R. SHENTON. 


(a) p. 415, 2nd line 
for N(vara,,,)~' read (N vara,,,)~}. 


(b) p. 415, 9th line, equation (18)


for (Aa)* read A*a*. 


(3) Biometrika (1959), 46, pp. 1-29 
‘On the cumulants of renewal processes.’ By W. L. Smith.


(a) p. 2, 3rd line from bottom 


Hs Ils 
for -—-5 read —-—,. 
Hy 2M 
(b) p. 18, 14th line, for \z(€)| read |Z,(€)). 


(c) p. 19, 9th line from bottom 


for s-—\ read 1s. 